面试真题
常见题目
9.所有考试科目的成绩都大于对应学科平均成绩的学生

常见大数据面试SQL-所有考试科目的成绩都大于对应学科平均成绩的学生

一、题目

有学生每科科目成绩,找出所有科目成绩都大于对应学科的平均成绩的学生

+------+------+--------+
| sid  | cid  | score  |
+------+------+--------+
| 1    | 1    | 90     |
| 1    | 2    | 50     |
| 1    | 3    | 72     |
| 2    | 1    | 40     |
| 2    | 2    | 50     |
| 2    | 3    | 22     |
| 3    | 1    | 30     |
| 3    | 2    | 50     |
| 3    | 3    | 52     |
| 4    | 1    | 90     |
| 4    | 2    | 90     |
| 4    | 3    | 72     |
+------+------+--------+

二、分析

题目要求找出每科科目成绩都大于对应学科平均成绩的学生,难点有两个:

  1. 给每行记录(每个学生每个学科)添加该学科的平均成绩,这里开窗函数可以解决;
  2. 查询出“所有”科目都大于平均成绩的学生,这里的所有比较难处理,有个技巧:对每个学生的每个科目满足“成绩>科目平均成绩”的记录打0,不满足的打1,然后对学生所有科目标签求和,和为0则满足“所有科目都大于平均成绩”,和>0则不满足;
维度评分
题目难度⭐️⭐️⭐️
题目清晰度⭐️⭐️⭐️⭐️⭐️
业务常见度⭐️⭐️⭐️⭐️

三、SQL

1.使用开窗函数给每行记录添加对应科目的平均成绩

执行SQL

select sid,
       cid,
       score,
       avg(score) over (partition by cid) as avg_score
from t9_scores

查询结果

+------+------+--------+------------+
| sid  | cid  | score  | avg_score  |
+------+------+--------+------------+
| 4    | 1    | 90     | 62.5       |
| 3    | 1    | 30     | 62.5       |
| 2    | 1    | 40     | 62.5       |
| 1    | 1    | 90     | 62.5       |
| 4    | 2    | 90     | 60.0       |
| 3    | 2    | 50     | 60.0       |
| 2    | 2    | 50     | 60.0       |
| 1    | 2    | 50     | 60.0       |
| 4    | 3    | 72     | 54.5       |
| 3    | 3    | 52     | 54.5       |
| 2    | 3    | 22     | 54.5       |
| 1    | 3    | 72     | 54.5       |
+------+------+--------+------------+

2.给每个学生每个科目进行打标,科目成绩>科目平均成绩的标记为0,反之为1;

执行SQL

select sid,
       cid,
       score,
       avg_score,
       if(score > avg_score, 0, 1) as flag
from (select sid,
             cid,
             score,
             avg(score) over (partition by cid) as avg_score
      from t9_scores) t

查询结果

+------+------+--------+------------+-------+
| sid  | cid  | score  | avg_score  | flag  |
+------+------+--------+------------+-------+
| 4    | 1    | 90     | 62.5       | 0     |
| 3    | 1    | 30     | 62.5       | 1     |
| 2    | 1    | 40     | 62.5       | 1     |
| 1    | 1    | 90     | 62.5       | 0     |
| 4    | 2    | 90     | 60.0       | 0     |
| 3    | 2    | 50     | 60.0       | 1     |
| 2    | 2    | 50     | 60.0       | 1     |
| 1    | 2    | 50     | 60.0       | 1     |
| 4    | 3    | 72     | 54.5       | 0     |
| 3    | 3    | 52     | 54.5       | 1     |
| 2    | 3    | 22     | 54.5       | 1     |
| 1    | 3    | 72     | 54.5       | 0     |
+------+------+--------+------------+-------+

3.使用sum统计flag和为0的学生即满足条件的学生

执行SQL

select sid
from (select sid,
             cid,
             score,
             avg_score,
             if(score > avg_score, 0, 1) as flag
      from (select sid,
                   cid,
                   score,
                   avg(score) over (partition by cid) as avg_score
            from t9_scores) t) tt
group by sid
having sum(flag) = 0

查询结果

+------+
| sid  |
+------+
| 4    |
+------+

四、建表语句和数据插入

--建表语句
CREATE TABLE t9_scores (
sid bigint COMMENT '学生ID',
cid bigint COMMENT '课程ID',
score bigint COMMENT '得分'
) COMMENT '用户课程分数';
-- 插入数据
insert into t9_scores(sid,cid,score)
values
(1,1,90),
(1,2,50),
(1,3,72),
(2,1,40),
(2,2,50),
(2,3,22),
(3,1,30),
(3,2,50),
(3,3,52),
(4,1,90),
(4,2,90),
(4,3,72)

本文同步在微信公众号”数据仓库技术“和个人博客”数据仓库技术 (opens in a new tab)“发表;