面试真题
百度
1.股票波峰波谷

百度大数据面试SQL-股票波峰波谷

一、题目

有如下数据,记录每天每只股票的收盘价格,请查出每只股票的波峰和波谷的日期和价格; 波峰:股票价格高于前一天和后一天价格时为波峰 波谷:股票价格低于前一天和后一天价格是为波谷

样例数据

+------------+-------------+--------+
|  ts_code   | trade_date  | close  |
+------------+-------------+--------+
| 000001.SZ  | 20220104    | 16.66  |
| 000002.SZ  | 20220104    | 20.49  |
| 000001.SZ  | 20220105    | 17.15  |
| 000002.SZ  | 20220105    | 21.17  |
| 000001.SZ  | 20220106    | 17.12  |
| 000002.SZ  | 20220106    | 21.05  |
| 000001.SZ  | 20220107    | 17.2   |
| 000002.SZ  | 20220107    | 21.89  |
| 000001.SZ  | 20220110    | 17.19  |
| 000002.SZ  | 20220110    | 22.16  |
| 000001.SZ  | 20220111    | 17.41  |
| 000002.SZ  | 20220111    | 22.3   |
| 000001.SZ  | 20220112    | 17.0   |
| 000002.SZ  | 20220112    | 22.05  |
| 000001.SZ  | 20220113    | 16.98  |
| 000002.SZ  | 20220113    | 21.53  |
| 000001.SZ  | 20220114    | 16.33  |
| 000002.SZ  | 20220114    | 20.7   |
| 000001.SZ  | 20220117    | 16.22  |
| 000002.SZ  | 20220117    | 20.87  |
+------------+-------------+--------+

二、分析

需要比较当天价格与前一天、后一天的价格进行比较,常规想法为进行关联,股票ID相等、日期为当天日期减1,为前一天价格,日期为当天价格加1,为后一天价格,然后进行计算;简化方法为使用lag和lead函数,可以避免进行表关联; lag()函数

LAG(col,n,DEFAULT) 用于统计窗口内往上第n行。参数1为列名,参数2为往上第n行(可选,默认为1),参数3为默认值(当往上第n行为NULL时候,取默认值,如不指定,则为NULL)

lead()函数

LEAD(col,n,DEFAULT) 用于统计窗口内往下第n行。参数1为列名,参数2为往下第n行(可选,默认为1),参数3为默认值(当往下第n行为NULL时候,取默认值,如不指定,则为NULL)

维度评分
题目难度⭐️⭐️⭐️
题目清晰度⭐️⭐️⭐️⭐️⭐️
业务常见度⭐️⭐️⭐️⭐️

三、SQL

1.计算前一天和后一天的价格

我们使用lag函数和lead函数,对每支股票分组,开窗计算出每天股票记录的前一天和后一天记录中的价格。

执行SQL

select
   ts_code,
   trade_date,
   close,
   lag(close,1)over(partition by ts_code order by trade_date asc) as lastday_close,
   lead(close,1)over(partition by ts_code order by trade_date asc) as nextday_close 
   from t1_stock_test;

执行结果

+------------+-------------+--------+----------------+----------------+
|  ts_code   | trade_date  | close  | lastday_close  | nextday_close  |
+------------+-------------+--------+----------------+----------------+
| 000001.SZ  | 20220104    | 16.66  | NULL           | 17.15          |
| 000001.SZ  | 20220105    | 17.15  | 16.66          | 17.12          |
| 000001.SZ  | 20220106    | 17.12  | 17.15          | 17.2           |
| 000001.SZ  | 20220107    | 17.2   | 17.12          | 17.19          |
| 000001.SZ  | 20220110    | 17.19  | 17.2           | 17.41          |
| 000001.SZ  | 20220111    | 17.41  | 17.19          | 17.0           |
| 000001.SZ  | 20220112    | 17.0   | 17.41          | 16.98          |
| 000001.SZ  | 20220113    | 16.98  | 17.0           | 16.33          |
| 000001.SZ  | 20220114    | 16.33  | 16.98          | 16.22          |
| 000001.SZ  | 20220117    | 16.22  | 16.33          | NULL           |
| 000002.SZ  | 20220104    | 20.49  | NULL           | 21.17          |
| 000002.SZ  | 20220105    | 21.17  | 20.49          | 21.05          |
| 000002.SZ  | 20220106    | 21.05  | 21.17          | 21.89          |
| 000002.SZ  | 20220107    | 21.89  | 21.05          | 22.16          |
| 000002.SZ  | 20220110    | 22.16  | 21.89          | 22.3           |
| 000002.SZ  | 20220111    | 22.3   | 22.16          | 22.05          |
| 000002.SZ  | 20220112    | 22.05  | 22.3           | 21.53          |
| 000002.SZ  | 20220113    | 21.53  | 22.05          | 20.7           |
| 000002.SZ  | 20220114    | 20.7   | 21.53          | 20.87          |
| 000002.SZ  | 20220117    | 20.87  | 20.7           | NULL           |
+------------+-------------+--------+----------------+----------------+

2.判断是否为波峰和波谷

如果当天价格大于昨天和明天的价格则是波峰,如果当天价格小于昨天价格和明天的价格则是波谷,不符合条件为其他

执行SQL

select
    ts_code,
    trade_date,
    close,
    lastday_close,
    nextday_close,
    case when  close > lastday_close and close > nextday_close then '波峰'
        when  close < lastday_close and close < nextday_close then '波谷'
        else '其他' end as `point_type`
from
    (
    select
        ts_code,
        trade_date,
        close,
        lag(close,1)over(partition by ts_code order by trade_date asc) as lastday_close,
        lead(close,1)over(partition by ts_code order by trade_date asc) as nextday_close
    from t1_stock_test
    ) t

执行结果

+------------+-------------+--------+----------------+----------------+-------------+
|  ts_code   | trade_date  | close  | lastday_close  | nextday_close  | point_type  |
+------------+-------------+--------+----------------+----------------+-------------+
| 000001.SZ  | 20220104    | 16.66  | NULL           | 17.15          | 其他          |
| 000001.SZ  | 20220105    | 17.15  | 16.66          | 17.12          | 波峰          |
| 000001.SZ  | 20220106    | 17.12  | 17.15          | 17.2           | 波谷          |
| 000001.SZ  | 20220107    | 17.2   | 17.12          | 17.19          | 波峰          |
| 000001.SZ  | 20220110    | 17.19  | 17.2           | 17.41          | 波谷          |
| 000001.SZ  | 20220111    | 17.41  | 17.19          | 17.0           | 波峰          |
| 000001.SZ  | 20220112    | 17.0   | 17.41          | 16.98          | 其他          |
| 000001.SZ  | 20220113    | 16.98  | 17.0           | 16.33          | 其他          |
| 000001.SZ  | 20220114    | 16.33  | 16.98          | 16.22          | 其他          |
| 000001.SZ  | 20220117    | 16.22  | 16.33          | NULL           | 其他          |
| 000002.SZ  | 20220104    | 20.49  | NULL           | 21.17          | 其他          |
| 000002.SZ  | 20220105    | 21.17  | 20.49          | 21.05          | 波峰          |
| 000002.SZ  | 20220106    | 21.05  | 21.17          | 21.89          | 波谷          |
| 000002.SZ  | 20220107    | 21.89  | 21.05          | 22.16          | 其他          |
| 000002.SZ  | 20220110    | 22.16  | 21.89          | 22.3           | 其他          |
| 000002.SZ  | 20220111    | 22.3   | 22.16          | 22.05          | 波峰          |
| 000002.SZ  | 20220112    | 22.05  | 22.3           | 21.53          | 其他          |
| 000002.SZ  | 20220113    | 21.53  | 22.05          | 20.7           | 其他          |
| 000002.SZ  | 20220114    | 20.7   | 21.53          | 20.87          | 波谷          |
| 000002.SZ  | 20220117    | 20.87  | 20.7           | NULL           | 其他          |
+------------+-------------+--------+----------------+----------------+-------------+

3.查询结果

执行SQL

select 
    ts_code,
    trade_date,
    close,
    point_type
from 
    (
    select
        ts_code,
        trade_date,
        close,
        lastday_close,
        nextday_close,
        case when  close > lastday_close and close > nextday_close then '波峰'
            when  close < lastday_close and close < nextday_close then '波谷'
            else '其他' end as `point_type`
    from
        (
        select
            ts_code,
            trade_date,
            close,
            lag(close,1)over(partition by ts_code order by trade_date asc) as lastday_close,
            lead(close,1)over(partition by ts_code order by trade_date asc) as nextday_close
        from t1_stock_test
        ) t
    ) tt
where tt.point_type in('波峰','波谷')

执行结果

+------------+-------------+--------+-------------+
|  ts_code   | trade_date  | close  | point_type  |
+------------+-------------+--------+-------------+
| 000001.SZ  | 20220105    | 17.15  | 波峰          |
| 000001.SZ  | 20220106    | 17.12  | 波谷          |
| 000001.SZ  | 20220107    | 17.2   | 波峰          |
| 000001.SZ  | 20220110    | 17.19  | 波谷          |
| 000001.SZ  | 20220111    | 17.41  | 波峰          |
| 000002.SZ  | 20220105    | 21.17  | 波峰          |
| 000002.SZ  | 20220106    | 21.05  | 波谷          |
| 000002.SZ  | 20220111    | 22.3   | 波峰          |
| 000002.SZ  | 20220114    | 20.7   | 波谷          |
+------------+-------------+--------+-------------+

四、建表语句和数据插入

create table t1_stock_test(
ts_code string comment '股票代码',
trade_date string comment '交易日期',
close float comment '收盘价'
);
 
INSERT INTO `t1_stock_test` VALUES 
('000001.SZ','20220104',16.66),
('000002.SZ','20220104',20.49),
('000001.SZ','20220105',17.15),
('000002.SZ','20220105',21.17),
('000001.SZ','20220106',17.12),
('000002.SZ','20220106',21.05),
('000001.SZ','20220107',17.2),
('000002.SZ','20220107',21.89),
('000001.SZ','20220110',17.19),
('000002.SZ','20220110',22.16),
('000001.SZ','20220111',17.41),
('000002.SZ','20220111',22.3),
('000001.SZ','20220112',17),
('000002.SZ','20220112',22.05),
('000001.SZ','20220113',16.98),
('000002.SZ','20220113',21.53),
('000001.SZ','20220114',16.33),
('000002.SZ','20220114',20.7),
('000001.SZ','20220117',16.22),
('000002.SZ','20220117',20.87);

本文同步在微信公众号”数据仓库技术“和个人博客”数据仓库技术 (opens in a new tab)“发表;