常见大数据面试SQL-所有考试科目的成绩都大于对应学科平均成绩的学生
一、题目
有学生每科科目成绩,找出所有科目成绩都大于对应学科的平均成绩的学生
+------+------+--------+
| sid | cid | score |
+------+------+--------+
| 1 | 1 | 90 |
| 1 | 2 | 50 |
| 1 | 3 | 72 |
| 2 | 1 | 40 |
| 2 | 2 | 50 |
| 2 | 3 | 22 |
| 3 | 1 | 30 |
| 3 | 2 | 50 |
| 3 | 3 | 52 |
| 4 | 1 | 90 |
| 4 | 2 | 90 |
| 4 | 3 | 72 |
+------+------+--------+
二、分析
题目要求找出每科科目成绩都大于对应学科平均成绩的学生,难点有两个:
- 给每行记录(每个学生每个学科)添加该学科的平均成绩,这里开窗函数可以解决;
- 查询出“所有”科目都大于平均成绩的学生,这里的所有比较难处理,有个技巧:对每个学生的每个科目满足“成绩>科目平均成绩”的记录打0,不满足的打1,然后对学生所有科目标签求和,和为0则满足“所有科目都大于平均成绩”,和>0则不满足;
维度 | 评分 |
---|---|
题目难度 | ⭐️⭐️⭐️ |
题目清晰度 | ⭐️⭐️⭐️⭐️⭐️ |
业务常见度 | ⭐️⭐️⭐️⭐️ |
三、SQL
1.使用开窗函数给每行记录添加对应科目的平均成绩
执行SQL
select sid,
cid,
score,
avg(score) over (partition by cid) as avg_score
from t9_scores
查询结果
+------+------+--------+------------+
| sid | cid | score | avg_score |
+------+------+--------+------------+
| 4 | 1 | 90 | 62.5 |
| 3 | 1 | 30 | 62.5 |
| 2 | 1 | 40 | 62.5 |
| 1 | 1 | 90 | 62.5 |
| 4 | 2 | 90 | 60.0 |
| 3 | 2 | 50 | 60.0 |
| 2 | 2 | 50 | 60.0 |
| 1 | 2 | 50 | 60.0 |
| 4 | 3 | 72 | 54.5 |
| 3 | 3 | 52 | 54.5 |
| 2 | 3 | 22 | 54.5 |
| 1 | 3 | 72 | 54.5 |
+------+------+--------+------------+
2.给每个学生每个科目进行打标,科目成绩>科目平均成绩的标记为0,反之为1;
执行SQL
select sid,
cid,
score,
avg_score,
if(score > avg_score, 0, 1) as flag
from (select sid,
cid,
score,
avg(score) over (partition by cid) as avg_score
from t9_scores) t
查询结果
+------+------+--------+------------+-------+
| sid | cid | score | avg_score | flag |
+------+------+--------+------------+-------+
| 4 | 1 | 90 | 62.5 | 0 |
| 3 | 1 | 30 | 62.5 | 1 |
| 2 | 1 | 40 | 62.5 | 1 |
| 1 | 1 | 90 | 62.5 | 0 |
| 4 | 2 | 90 | 60.0 | 0 |
| 3 | 2 | 50 | 60.0 | 1 |
| 2 | 2 | 50 | 60.0 | 1 |
| 1 | 2 | 50 | 60.0 | 1 |
| 4 | 3 | 72 | 54.5 | 0 |
| 3 | 3 | 52 | 54.5 | 1 |
| 2 | 3 | 22 | 54.5 | 1 |
| 1 | 3 | 72 | 54.5 | 0 |
+------+------+--------+------------+-------+
3.使用sum统计flag和为0的学生即满足条件的学生
执行SQL
select sid
from (select sid,
cid,
score,
avg_score,
if(score > avg_score, 0, 1) as flag
from (select sid,
cid,
score,
avg(score) over (partition by cid) as avg_score
from t9_scores) t) tt
group by sid
having sum(flag) = 0
查询结果
+------+
| sid |
+------+
| 4 |
+------+
四、建表语句和数据插入
--建表语句
CREATE TABLE t9_scores (
sid bigint COMMENT '学生ID',
cid bigint COMMENT '课程ID',
score bigint COMMENT '得分'
) COMMENT '用户课程分数';
-- 插入数据
insert into t9_scores(sid,cid,score)
values
(1,1,90),
(1,2,50),
(1,3,72),
(2,1,40),
(2,2,50),
(2,3,22),
(3,1,30),
(3,2,50),
(3,3,52),
(4,1,90),
(4,2,90),
(4,3,72)
本文同步在微信公众号”数据仓库技术“和个人博客”数据仓库技术 (opens in a new tab)“发表;