面试真题
常见题目
3.去掉最大最小值的部门平均薪水

常见大数据面试SQL-去掉最大最小值的部门平均薪水

一、题目

有员工薪资表t3_salary,包含员工ID(emp_id),部门ID(depart_id),薪水(salary),请计算去除最高最低薪资后的平均薪水;(每个部门员工数不少于3人)

+---------+------------+-----------+
| emp_id  | depart_id  |  salary   |
+---------+------------+-----------+
| 1001    | 1          | 5000.00   |
| 1002    | 1          | 10000.00  |
| 1003    | 1          | 20000.00  |
| 1004    | 1          | 30000.00  |
| 1005    | 1          | 6000.00   |
| 1006    | 1          | 10000.00  |
| 1007    | 1          | 11000.00  |
| 1008    | 2          | 3000.00   |
| 1009    | 2          | 7000.00   |
| 1010    | 2          | 9000.00   |
| 1011    | 2          | 30000.00  |
+---------+------------+-----------+

二、分析

  1. 找到部门内的最高最低薪资,去掉这些行,考察点为row_number()开窗函数;
  2. 本题的难点在于同时去掉最高和最低,所以需要按照不同的排序进行处理,所以需要进行两次开窗,排序;
  3. 去除最高最低薪资之后,按照部门分组进行分组取平均即可;
维度评分
题目难度⭐️⭐️⭐️
题目清晰度⭐️⭐️⭐️⭐️⭐️
业务常见度⭐️⭐️⭐️⭐️

三、SQL

1.分别按照正序和倒序进行开窗,得到部门最高最低薪资记录

执行SQL

select emp_id,
       depart_id,
       salary,
       row_number() over (partition by depart_id order by salary asc)  as asc_order,
       row_number() over (partition by depart_id order by salary desc) as desc_order
from t3_salary;

查询结果

+---------+------------+-----------+------------+-------------+
| emp_id  | depart_id  |  salary   | asc_order  | desc_order  |
+---------+------------+-----------+------------+-------------+
| 1004    | 1          | 30000.00  | 7          | 1           |
| 1003    | 1          | 20000.00  | 6          | 2           |
| 1007    | 1          | 11000.00  | 5          | 3           |
| 1002    | 1          | 10000.00  | 4          | 4           |
| 1006    | 1          | 10000.00  | 3          | 5           |
| 1005    | 1          | 6000.00   | 2          | 6           |
| 1001    | 1          | 5000.00   | 1          | 7           |
| 1011    | 2          | 30000.00  | 4          | 1           |
| 1010    | 2          | 9000.00   | 3          | 2           |
| 1009    | 2          | 7000.00   | 2          | 3           |
| 1008    | 2          | 3000.00   | 1          | 4           |
+---------+------------+-----------+------------+-------------+

2.去掉最高最低薪资,这里只能有第一行,不能取排序最大值进行去重,那样处理起来太复杂(不能确定排序最大值是多少,这也是进行两次开窗排序的原因)

执行SQL

select emp_id,
       depart_id,
       salary,
       asc_order,
       desc_order
from (select emp_id,
             depart_id,
             salary,
             row_number() over (partition by depart_id order by salary asc)  as asc_order,
             row_number() over (partition by depart_id order by salary desc) as desc_order
      from t3_salary) t
where asc_order > 1
  and desc_order > 1

查询结果

+---------+------------+-----------+------------+-------------+
| emp_id  | depart_id  |  salary   | asc_order  | desc_order  |
+---------+------------+-----------+------------+-------------+
| 1003    | 1          | 20000.00  | 6          | 2           |
| 1007    | 1          | 11000.00  | 5          | 3           |
| 1002    | 1          | 10000.00  | 4          | 4           |
| 1006    | 1          | 10000.00  | 3          | 5           |
| 1005    | 1          | 6000.00   | 2          | 6           |
| 1010    | 2          | 9000.00   | 3          | 2           |
| 1009    | 2          | 7000.00   | 2          | 3           |
+---------+------------+-----------+------------+-------------+

3.查询部门平均薪水

执行SQL

select depart_id,
       avg(salary) as avg_salary
from (select emp_id,
             depart_id,
             salary,
             row_number() over (partition by depart_id order by salary asc)  as asc_order,
             row_number() over (partition by depart_id order by salary desc) as desc_order
      from t3_salary) t
where asc_order > 1
  and desc_order > 1
group by depart_id

查询结果

+------------+---------------+
| depart_id  |  avg_salary   |
+------------+---------------+
| 1          | 11400.000000  |
| 2          | 8000.000000   |
+------------+---------------+

引伸:

1.题目中每个部门员工>=3人,则去除最高最低之后至少还有1人,如果部门人员为2人或者1人该如何处理?

2.如果最高和最低薪资不止一人,该如何处理?(row_number、rank究竟用哪个)

四、建表语句和数据插入

--建表语句
CREATE TABLE t3_salary (
  emp_id bigint,
  depart_id bigint,
  salary decimal(16,2)
);
--插入数据
insert into t3_salary (emp_id,depart_id,salary)
values
(1001,1,5000.00),
(1002,1,10000.00),
(1003,1,20000.00),
(1004,1,30000.00),
(1005,1,6000.00),
(1006,1,10000.00),
(1007,1,11000.00),
(1008,2,3000.00),
(1009,2,7000.00),
(1010,2,9000.00),
(1011,2,30000.00);

本文同步在微信公众号”数据仓库技术“和个人博客”数据仓库技术 (opens in a new tab)“发表;