面试真题
百度
4.连续签到领金币

百度大数据面试SQL-连续签到领金币

一、题目

有用户签到记录表,t4_coin_signin,记录用户当天是否完成签到,请计算出每个用户的每个月获得的金币数量;

签到领金币规则如下:

  1. 用户签到获得1金币;
  2. 如果用户连续签到3天则第三天获得2金币,如果用户连续签到7天则第7天获得5金币;
  3. 连续签到7天后连续天数重置,每月签到天数重置;

样例数据

+----------+--------------+----------+
| user_id  | signin_date  | is_sign  |
+----------+--------------+----------+
| 001      | 2024-01-01   | 1        |
| 001      | 2024-01-02   | 1        |
| 001      | 2024-01-03   | 1        |
| 001      | 2024-01-04   | 0        |
| 001      | 2024-01-05   | 1        |
| 001      | 2024-01-06   | 1        |
| 001      | 2024-01-07   | 1        |
| 001      | 2024-01-08   | 1        |
| 001      | 2024-01-09   | 1        |
| 001      | 2024-01-10   | 1        |
| 001      | 2024-01-11   | 1        |
| 001      | 2024-01-12   | 1        |
| 001      | 2024-01-13   | 1        |
| 001      | 2024-01-14   | 1        |
| 001      | 2024-01-15   | 1        |
| 001      | 2024-01-16   | 1        |
| 001      | 2024-01-17   | 1        |
| 001      | 2024-01-18   | 1        |
| 001      | 2024-01-19   | 1        |
| 001      | 2024-01-20   | 0        |
| 001      | 2024-01-21   | 1        |
| 001      | 2024-01-22   | 1        |
| 001      | 2024-01-23   | 1        |
| 001      | 2024-01-24   | 0        |
| 001      | 2024-01-25   | 1        |
| 001      | 2024-01-26   | 1        |
| 001      | 2024-01-27   | 1        |
| 001      | 2024-01-28   | 1        |
| 001      | 2024-01-29   | 0        |
| 001      | 2024-01-30   | 1        |
| 001      | 2024-01-31   | 1        |
| 001      | 2024-02-01   | 1        |
| 001      | 2024-02-02   | 1        |
| 001      | 2024-02-03   | 1        |
| 001      | 2024-02-04   | 1        |
| 001      | 2024-02-05   | 1        |
| 001      | 2024-02-06   | 1        |
| 001      | 2024-02-07   | 1        |
| 001      | 2024-02-08   | 1        |
| 001      | 2024-02-09   | 1        |
| 001      | 2024-02-10   | 1        |
+----------+--------------+----------+

二、分析

  1. 本题难度较大,但是依旧是连续问题;
  2. 先解决连续问题,然后计算出每天是连续签到第几天;
  3. 处理7天重置问题,得到参与活动的实际连续第几天签到;
  4. 计算每天得到的金币数量,然后求和得到每月获得的金币总数;
维度评分
题目难度⭐️⭐️⭐️⭐️⭐
题目清晰度⭐️⭐️⭐️⭐️⭐
业务常见度⭐️⭐️⭐️⭐️

三、SQL

1.根据用户是否签到,判断用户是否连续签到

本题每个用户的日期记录是连续的,给出了当天用户是否签到。我们把签到日期记录为0,未签到日期记录为1。 根据用户、月份进行分组,按照日期排序,得到一个用户连续签到的分组 signin_group。为方便后续处理,增加sign_month 字段。

注意: 这里面的分组数据中包含了用户未签到的日期数据,所以不是标准的连续结果,我们稍后再进行处理。

执行SQL

select user_id,
       signin_date,
       is_sign,
       substr(signin_date, 1, 7)                                                          as sign_month,
       sum(if(is_sign = 1, 0, 1))
           over (partition by user_id,substr(signin_date, 1, 7) order by signin_date asc) as signin_group
from t4_coin_signin

查询结果

+----------+--------------+----------+-------------+---------------+
| user_id  | signin_date  | is_sign  | sign_month  | signin_group  |
+----------+--------------+----------+-------------+---------------+
| 001      | 2024-01-01   | 1        | 2024-01     | 0             |
| 001      | 2024-01-02   | 1        | 2024-01     | 0             |
| 001      | 2024-01-03   | 1        | 2024-01     | 0             |
| 001      | 2024-01-04   | 0        | 2024-01     | 1             |
| 001      | 2024-01-05   | 1        | 2024-01     | 1             |
| 001      | 2024-01-06   | 1        | 2024-01     | 1             |
| 001      | 2024-01-07   | 1        | 2024-01     | 1             |
| 001      | 2024-01-08   | 1        | 2024-01     | 1             |
| 001      | 2024-01-09   | 1        | 2024-01     | 1             |
| 001      | 2024-01-10   | 1        | 2024-01     | 1             |
| 001      | 2024-01-11   | 1        | 2024-01     | 1             |
| 001      | 2024-01-12   | 1        | 2024-01     | 1             |
| 001      | 2024-01-13   | 1        | 2024-01     | 1             |
| 001      | 2024-01-14   | 1        | 2024-01     | 1             |
| 001      | 2024-01-15   | 1        | 2024-01     | 1             |
| 001      | 2024-01-16   | 1        | 2024-01     | 1             |
| 001      | 2024-01-17   | 1        | 2024-01     | 1             |
| 001      | 2024-01-18   | 1        | 2024-01     | 1             |
| 001      | 2024-01-19   | 1        | 2024-01     | 1             |
| 001      | 2024-01-20   | 0        | 2024-01     | 2             |
| 001      | 2024-01-21   | 1        | 2024-01     | 2             |
| 001      | 2024-01-22   | 1        | 2024-01     | 2             |
| 001      | 2024-01-23   | 1        | 2024-01     | 2             |
| 001      | 2024-01-24   | 0        | 2024-01     | 3             |
| 001      | 2024-01-25   | 1        | 2024-01     | 3             |
| 001      | 2024-01-26   | 1        | 2024-01     | 3             |
| 001      | 2024-01-27   | 1        | 2024-01     | 3             |
| 001      | 2024-01-28   | 1        | 2024-01     | 3             |
| 001      | 2024-01-29   | 0        | 2024-01     | 4             |
| 001      | 2024-01-30   | 1        | 2024-01     | 4             |
| 001      | 2024-01-31   | 1        | 2024-01     | 4             |
| 001      | 2024-02-01   | 1        | 2024-02     | 0             |
| 001      | 2024-02-02   | 1        | 2024-02     | 0             |
| 001      | 2024-02-03   | 1        | 2024-02     | 0             |
| 001      | 2024-02-04   | 1        | 2024-02     | 0             |
| 001      | 2024-02-05   | 1        | 2024-02     | 0             |
| 001      | 2024-02-06   | 1        | 2024-02     | 0             |
| 001      | 2024-02-07   | 1        | 2024-02     | 0             |
| 001      | 2024-02-08   | 1        | 2024-02     | 0             |
| 001      | 2024-02-09   | 1        | 2024-02     | 0             |
| 001      | 2024-02-10   | 1        | 2024-02     | 0             |
+----------+--------------+----------+-------------+---------------+

2.计算用户签到日期,是第几天连续

先计算出用户当月实际是第几天连续签到,增加限制用户签到状态。根据按照用户、月份、连续组signin_group 进行分组,按照日期排序,使用count(signin_date) 计算出截止到当前的签到天数。

注意: 注意这里的有where 条件 和 count()开窗的分组条件。

执行SQL

select user_id,
       signin_date,
       is_sign,
       sign_month,
       signin_group,
       count(signin_date)over(partition by user_id,sign_month,signin_group order by signin_date asc) as conn_sign_days
from (select user_id,
             signin_date,
             is_sign,
             substr(signin_date, 1, 7)                                                          as sign_month,
             sum(if(is_sign = 1, 0, 1))
                 over (partition by user_id,substr(signin_date, 1, 7) order by signin_date asc) as signin_group
      from t4_coin_signin) t
where is_sign = 1

查询结果

+----------+--------------+----------+-------------+---------------+-----------------+
| user_id  | signin_date  | is_sign  | sign_month  | signin_group  | conn_sign_days  |
+----------+--------------+----------+-------------+---------------+-----------------+
| 001      | 2024-01-01   | 1        | 2024-01     | 0             | 1               |
| 001      | 2024-01-02   | 1        | 2024-01     | 0             | 2               |
| 001      | 2024-01-03   | 1        | 2024-01     | 0             | 3               |
| 001      | 2024-01-05   | 1        | 2024-01     | 1             | 1               |
| 001      | 2024-01-06   | 1        | 2024-01     | 1             | 2               |
| 001      | 2024-01-07   | 1        | 2024-01     | 1             | 3               |
| 001      | 2024-01-08   | 1        | 2024-01     | 1             | 4               |
| 001      | 2024-01-09   | 1        | 2024-01     | 1             | 5               |
| 001      | 2024-01-10   | 1        | 2024-01     | 1             | 6               |
| 001      | 2024-01-11   | 1        | 2024-01     | 1             | 7               |
| 001      | 2024-01-12   | 1        | 2024-01     | 1             | 8               |
| 001      | 2024-01-13   | 1        | 2024-01     | 1             | 9               |
| 001      | 2024-01-14   | 1        | 2024-01     | 1             | 10              |
| 001      | 2024-01-15   | 1        | 2024-01     | 1             | 11              |
| 001      | 2024-01-16   | 1        | 2024-01     | 1             | 12              |
| 001      | 2024-01-17   | 1        | 2024-01     | 1             | 13              |
| 001      | 2024-01-18   | 1        | 2024-01     | 1             | 14              |
| 001      | 2024-01-19   | 1        | 2024-01     | 1             | 15              |
| 001      | 2024-01-21   | 1        | 2024-01     | 2             | 1               |
| 001      | 2024-01-22   | 1        | 2024-01     | 2             | 2               |
| 001      | 2024-01-23   | 1        | 2024-01     | 2             | 3               |
| 001      | 2024-01-25   | 1        | 2024-01     | 3             | 1               |
| 001      | 2024-01-26   | 1        | 2024-01     | 3             | 2               |
| 001      | 2024-01-27   | 1        | 2024-01     | 3             | 3               |
| 001      | 2024-01-28   | 1        | 2024-01     | 3             | 4               |
| 001      | 2024-01-30   | 1        | 2024-01     | 4             | 1               |
| 001      | 2024-01-31   | 1        | 2024-01     | 4             | 2               |
| 001      | 2024-02-01   | 1        | 2024-02     | 0             | 1               |
| 001      | 2024-02-02   | 1        | 2024-02     | 0             | 2               |
| 001      | 2024-02-03   | 1        | 2024-02     | 0             | 3               |
| 001      | 2024-02-04   | 1        | 2024-02     | 0             | 4               |
| 001      | 2024-02-05   | 1        | 2024-02     | 0             | 5               |
| 001      | 2024-02-06   | 1        | 2024-02     | 0             | 6               |
| 001      | 2024-02-07   | 1        | 2024-02     | 0             | 7               |
| 001      | 2024-02-08   | 1        | 2024-02     | 0             | 8               |
| 001      | 2024-02-09   | 1        | 2024-02     | 0             | 9               |
| 001      | 2024-02-10   | 1        | 2024-02     | 0             | 10              |
+----------+--------------+----------+-------------+---------------+-----------------+

3.处理签到天数

使用mod函数,对conn_sign_days进行处理,每7天重置,这里会得到06的结果,其中16为准确连续天数,0代表第7天,需要特殊处理一下。

执行SQL

select user_id,
       signin_date,
       is_sign,
       sign_month,
       signin_group,
       if(mod(count(signin_date) over (partition by user_id,sign_month,signin_group order by signin_date asc), 7) = 0,
          7, 
          mod(count(signin_date) over (partition by user_id,sign_month,signin_group order by signin_date asc),
                 7)) as conn_sign_days
from (select user_id,
             signin_date,
             is_sign,
             substr(signin_date, 1, 7)                                                          as sign_month,
             sum(if(is_sign = 1, 0, 1))
                 over (partition by user_id,substr(signin_date, 1, 7) order by signin_date asc) as signin_group
      from t4_coin_signin) t
where is_sign = 1

查询结果

+----------+--------------+----------+-------------+---------------+-----------------+
| user_id  | signin_date  | is_sign  | sign_month  | signin_group  | conn_sign_days  |
+----------+--------------+----------+-------------+---------------+-----------------+
| 001      | 2024-01-01   | 1        | 2024-01     | 0             | 1               |
| 001      | 2024-01-02   | 1        | 2024-01     | 0             | 2               |
| 001      | 2024-01-03   | 1        | 2024-01     | 0             | 3               |
| 001      | 2024-01-05   | 1        | 2024-01     | 1             | 1               |
| 001      | 2024-01-06   | 1        | 2024-01     | 1             | 2               |
| 001      | 2024-01-07   | 1        | 2024-01     | 1             | 3               |
| 001      | 2024-01-08   | 1        | 2024-01     | 1             | 4               |
| 001      | 2024-01-09   | 1        | 2024-01     | 1             | 5               |
| 001      | 2024-01-10   | 1        | 2024-01     | 1             | 6               |
| 001      | 2024-01-11   | 1        | 2024-01     | 1             | 7               |
| 001      | 2024-01-12   | 1        | 2024-01     | 1             | 1               |
| 001      | 2024-01-13   | 1        | 2024-01     | 1             | 2               |
| 001      | 2024-01-14   | 1        | 2024-01     | 1             | 3               |
| 001      | 2024-01-15   | 1        | 2024-01     | 1             | 4               |
| 001      | 2024-01-16   | 1        | 2024-01     | 1             | 5               |
| 001      | 2024-01-17   | 1        | 2024-01     | 1             | 6               |
| 001      | 2024-01-18   | 1        | 2024-01     | 1             | 7               |
| 001      | 2024-01-19   | 1        | 2024-01     | 1             | 1               |
| 001      | 2024-01-21   | 1        | 2024-01     | 2             | 1               |
| 001      | 2024-01-22   | 1        | 2024-01     | 2             | 2               |
| 001      | 2024-01-23   | 1        | 2024-01     | 2             | 3               |
| 001      | 2024-01-25   | 1        | 2024-01     | 3             | 1               |
| 001      | 2024-01-26   | 1        | 2024-01     | 3             | 2               |
| 001      | 2024-01-27   | 1        | 2024-01     | 3             | 3               |
| 001      | 2024-01-28   | 1        | 2024-01     | 3             | 4               |
| 001      | 2024-01-30   | 1        | 2024-01     | 4             | 1               |
| 001      | 2024-01-31   | 1        | 2024-01     | 4             | 2               |
| 001      | 2024-02-01   | 1        | 2024-02     | 0             | 1               |
| 001      | 2024-02-02   | 1        | 2024-02     | 0             | 2               |
| 001      | 2024-02-03   | 1        | 2024-02     | 0             | 3               |
| 001      | 2024-02-04   | 1        | 2024-02     | 0             | 4               |
| 001      | 2024-02-05   | 1        | 2024-02     | 0             | 5               |
| 001      | 2024-02-06   | 1        | 2024-02     | 0             | 6               |
| 001      | 2024-02-07   | 1        | 2024-02     | 0             | 7               |
| 001      | 2024-02-08   | 1        | 2024-02     | 0             | 1               |
| 001      | 2024-02-09   | 1        | 2024-02     | 0             | 2               |
| 001      | 2024-02-10   | 1        | 2024-02     | 0             | 3               |
+----------+--------------+----------+-------------+---------------+-----------------+

4.计算每天得到的金币数

根据conn_sign_days 中签到第几天,得出每天应该得到多少金币。

执行SQL

select user_id,
       signin_date,
       is_sign,
       sign_month,
       signin_group,
       conn_sign_days,
       case when conn_sign_days = 3 then 2 when conn_sign_days = 7 then 5 else 1 end as coin_num
from (select user_id,
             signin_date,
             is_sign,
             sign_month,
             signin_group,
             if(mod(count(signin_date) over (partition by user_id,sign_month,signin_group order by signin_date asc),
                    7) = 0,
                7,
                mod(count(signin_date) over (partition by user_id,sign_month,signin_group order by signin_date asc),
                    7)) as conn_sign_days
      from (select user_id,
                   signin_date,
                   is_sign,
                   substr(signin_date, 1, 7)                                                          as sign_month,
                   sum(if(is_sign = 1, 0, 1))
                       over (partition by user_id,substr(signin_date, 1, 7) order by signin_date asc) as signin_group
            from t4_coin_signin) t
      where is_sign = 1) tt

执行结果

+----------+--------------+----------+-------------+---------------+-----------------+-----------+
| user_id  | signin_date  | is_sign  | sign_month  | signin_group  | conn_sign_days  | coin_num  |
+----------+--------------+----------+-------------+---------------+-----------------+-----------+
| 001      | 2024-01-01   | 1        | 2024-01     | 0             | 1               | 1         |
| 001      | 2024-01-02   | 1        | 2024-01     | 0             | 2               | 1         |
| 001      | 2024-01-03   | 1        | 2024-01     | 0             | 3               | 2         |
| 001      | 2024-01-05   | 1        | 2024-01     | 1             | 1               | 1         |
| 001      | 2024-01-06   | 1        | 2024-01     | 1             | 2               | 1         |
| 001      | 2024-01-07   | 1        | 2024-01     | 1             | 3               | 2         |
| 001      | 2024-01-08   | 1        | 2024-01     | 1             | 4               | 1         |
| 001      | 2024-01-09   | 1        | 2024-01     | 1             | 5               | 1         |
| 001      | 2024-01-10   | 1        | 2024-01     | 1             | 6               | 1         |
| 001      | 2024-01-11   | 1        | 2024-01     | 1             | 7               | 5         |
| 001      | 2024-01-12   | 1        | 2024-01     | 1             | 1               | 1         |
| 001      | 2024-01-13   | 1        | 2024-01     | 1             | 2               | 1         |
| 001      | 2024-01-14   | 1        | 2024-01     | 1             | 3               | 2         |
| 001      | 2024-01-15   | 1        | 2024-01     | 1             | 4               | 1         |
| 001      | 2024-01-16   | 1        | 2024-01     | 1             | 5               | 1         |
| 001      | 2024-01-17   | 1        | 2024-01     | 1             | 6               | 1         |
| 001      | 2024-01-18   | 1        | 2024-01     | 1             | 7               | 5         |
| 001      | 2024-01-19   | 1        | 2024-01     | 1             | 1               | 1         |
| 001      | 2024-01-21   | 1        | 2024-01     | 2             | 1               | 1         |
| 001      | 2024-01-22   | 1        | 2024-01     | 2             | 2               | 1         |
| 001      | 2024-01-23   | 1        | 2024-01     | 2             | 3               | 2         |
| 001      | 2024-01-25   | 1        | 2024-01     | 3             | 1               | 1         |
| 001      | 2024-01-26   | 1        | 2024-01     | 3             | 2               | 1         |
| 001      | 2024-01-27   | 1        | 2024-01     | 3             | 3               | 2         |
| 001      | 2024-01-28   | 1        | 2024-01     | 3             | 4               | 1         |
| 001      | 2024-01-30   | 1        | 2024-01     | 4             | 1               | 1         |
| 001      | 2024-01-31   | 1        | 2024-01     | 4             | 2               | 1         |
| 001      | 2024-02-01   | 1        | 2024-02     | 0             | 1               | 1         |
| 001      | 2024-02-02   | 1        | 2024-02     | 0             | 2               | 1         |
| 001      | 2024-02-03   | 1        | 2024-02     | 0             | 3               | 2         |
| 001      | 2024-02-04   | 1        | 2024-02     | 0             | 4               | 1         |
| 001      | 2024-02-05   | 1        | 2024-02     | 0             | 5               | 1         |
| 001      | 2024-02-06   | 1        | 2024-02     | 0             | 6               | 1         |
| 001      | 2024-02-07   | 1        | 2024-02     | 0             | 7               | 5         |
| 001      | 2024-02-08   | 1        | 2024-02     | 0             | 1               | 1         |
| 001      | 2024-02-09   | 1        | 2024-02     | 0             | 2               | 1         |
| 001      | 2024-02-10   | 1        | 2024-02     | 0             | 3               | 2         |
+----------+--------------+----------+-------------+---------------+-----------------+-----------+

5.计算没人每月得到的金币数

直接按照user_id,sign_mont分组,对coin_num求和即可

执行SQL

select user_id,
       sign_month,
       sum(coin_num) as month_coin_num
from (select user_id,
             signin_date,
             is_sign,
             sign_month,
             signin_group,
             conn_sign_days,
             case when conn_sign_days = 3 then 2 when conn_sign_days = 7 then 5 else 1 end as coin_num
      from (select user_id,
                   signin_date,
                   is_sign,
                   sign_month,
                   signin_group,
                   if(mod(count(signin_date)
                                over (partition by user_id,sign_month,signin_group order by signin_date asc),
                          7) = 0,
                      7,
                      mod(count(signin_date)
                                over (partition by user_id,sign_month,signin_group order by signin_date asc),
                          7)) as conn_sign_days
            from (select user_id,
                         signin_date,
                         is_sign,
                         substr(signin_date, 1, 7)                                                          as sign_month,
                         sum(if(is_sign = 1, 0, 1))
                             over (partition by user_id,substr(signin_date, 1, 7) order by signin_date asc) as signin_group
                  from t4_coin_signin) t
            where is_sign = 1) tt) ttt
group by user_id, sign_month

执行结果

+----------+-------------+-----------------+
| user_id  | sign_month  | month_coin_num  |
+----------+-------------+-----------------+
| 001      | 2024-01     | 40              |
| 001      | 2024-02     | 16              |
+----------+-------------+-----------------+

四、建表语句和数据插入

--建表语句
CREATE TABLE t4_coin_signin
(
    user_id     string COMMENT '用户ID',
    signin_date string COMMENT '日期',
    is_sign     bigint COMMENT '是否签到 1-签到,0-未签到'
) COMMENT '签到领金币记录表';
-- 插入数据
insert into t4_coin_signin(user_id, signin_date, is_sign)
values ('001', '2024-01-01', 1),
       ('001', '2024-01-02', 1),
       ('001', '2024-01-03', 1),
       ('001', '2024-01-04', 0),
       ('001', '2024-01-05', 1),
       ('001', '2024-01-06', 1),
       ('001', '2024-01-07', 1),
       ('001', '2024-01-08', 1),
       ('001', '2024-01-09', 1),
       ('001', '2024-01-10', 1),
       ('001', '2024-01-11', 1),
       ('001', '2024-01-12', 1),
       ('001', '2024-01-13', 1),
       ('001', '2024-01-14', 1),
       ('001', '2024-01-15', 1),
       ('001', '2024-01-16', 1),
       ('001', '2024-01-17', 1),
       ('001', '2024-01-18', 1),
       ('001', '2024-01-19', 1),
       ('001', '2024-01-20', 0),
       ('001', '2024-01-21', 1),
       ('001', '2024-01-22', 1),
       ('001', '2024-01-23', 1),
       ('001', '2024-01-24', 0),
       ('001', '2024-01-25', 1),
       ('001', '2024-01-26', 1),
       ('001', '2024-01-27', 1),
       ('001', '2024-01-28', 1),
       ('001', '2024-01-29', 0),
       ('001', '2024-01-30', 1),
       ('001', '2024-01-31', 1),
       ('001', '2024-02-01', 1),
       ('001', '2024-02-02', 1),
       ('001', '2024-02-03', 1),
       ('001', '2024-02-04', 1),
       ('001', '2024-02-05', 1),
       ('001', '2024-02-06', 1),
       ('001', '2024-02-07', 1),
       ('001', '2024-02-08', 1),
       ('001', '2024-02-09', 1),
       ('001', '2024-02-10', 1);

本文同步在微信公众号”数据仓库技术“和个人博客”数据仓库技术 (opens in a new tab)“发表;