meta大数据面试SQL-计算每个用户的受欢迎程度

一、题目

有好友关系表t1_friend，记录了user1_id,user2_id的好友关系对。现定义用户受欢迎程度=用户拥有的朋友总数/平台上的用户总数,请计算出每个用户的受欢迎程度。

+-----------+-----------+
| user1_id  | user2_id  |
+-----------+-----------+
| 1         | 2         |
| 1         | 3         |
| 1         | 4         |
| 1         | 5         |
| 2         | 3         |
| 2         | 4         |
| 3         | 4         |
| 4         | 5         |
| 5         | 6         |
| 5         | 7         |
| 7         | 8         |
| 9         | 10        |
+-----------+-----------+

二、分析

题目中数据user1_id,user2_id为互为好友关系，为关系对，即1与2是好友关系，则1-2,2-1记录只会存在一条，为方便计算，我们需要有两条记录。所以将user2_id与user1_id 互换，然后与原表进行union all;
对union all后的数据，按照user1_id分组，统计user2_id的个数，即user1_id 的好友数据,使用开窗计算出用户总数；

维度	评分
题目难度	⭐️⭐️⭐️
题目清晰度	⭐️⭐️⭐️⭐️⭐️
业务常见度	⭐️⭐️⭐️⭐️

三、SQL

1.user1_id,user2_id互换，然后进行union all。

执行SQL

select user1_id,
       user2_id
from t1_friend
union all
select user2_id,
       user1_id
from t1_friend

查询结果

+---------------+---------------+
| _u1.user1_id  | _u1.user2_id  |
+---------------+---------------+
| 1             | 2             |
| 2             | 1             |
| 1             | 3             |
| 3             | 1             |
| 1             | 4             |
| 4             | 1             |
| 1             | 5             |
| 5             | 1             |
| 2             | 3             |
| 3             | 2             |
| 2             | 4             |
| 4             | 2             |
| 3             | 4             |
| 4             | 3             |
| 4             | 5             |
| 5             | 4             |
| 5             | 6             |
| 6             | 5             |
| 5             | 7             |
| 7             | 5             |
| 7             | 8             |
| 8             | 7             |
| 9             | 10            |
| 10            | 9             |
+---------------+---------------+

2.计算每个用户的好友数,开窗计算出总用户数

执行SQL

with tmp as
         (select user1_id,
                 user2_id
          from t1_friend
          union all
          select user2_id,
                 user1_id
          from t1_friend)
select user1_id, 
       count(user2_id) as friend_cnt, 
       count(distinct user1_id) over () as total_cnt
from tmp
group by user1_id

查询结果

+-----------+-------------+------------+
| user1_id  | friend_cnt  | total_cnt  |
+-----------+-------------+------------+
| 10        | 1           | 10         |
| 9         | 1           | 10         |
| 8         | 1           | 10         |
| 7         | 2           | 10         |
| 6         | 1           | 10         |
| 5         | 4           | 10         |
| 4         | 4           | 10         |
| 3         | 3           | 10         |
| 2         | 3           | 10         |
| 1         | 4           | 10         |
+-----------+-------------+------------+

3.用户好友数除以总用户数计算最终结果

with tmp as
         (select user1_id,
                 user2_id
          from t1_friend
          union all
          select user2_id,
                 user1_id
          from t1_friend)
select user1_id,
       count(user2_id) /
       count(distinct user1_id) over () res
from tmp
group by user1_id

查询结果

+-----------+------+
| user1_id  | res  |
+-----------+------+
| 10        | 0.1  |
| 9         | 0.1  |
| 8         | 0.1  |
| 7         | 0.2  |
| 6         | 0.1  |
| 5         | 0.4  |
| 4         | 0.4  |
| 3         | 0.3  |
| 2         | 0.3  |
| 1         | 0.4  |
+-----------+------+

四、建表语句和数据插入

--建表语句
CREATE TABLE t1_friend(
user1_id bigint COMMENT '用户1ID',
user2_id bigint COMMENT '用户2ID'
) COMMENT '好友关系表';
-- 插入数据
insert into t1_friend(user1_id,user2_id)
values
(1,2),
(1,3),
(1,4),
(1,5),
(2,3),
(2,4),
(3,4),
(4,5),
(5,6),
(5,7),
(7,8),
(9,10)

本文同步在微信公众号”数据仓库技术“和个人博客”数据仓库技术 (opens in a new tab)“发表；
欢迎扫码关注公众号，添加博主好友加入数据仓库技术交流群；
如果觉得本文对您有帮助，欢迎扫码打赏

微信扫码备注「数据仓库」
加入数据仓库技术交流群

微信关注公众号「数据仓库技术」

微信扫码打赏

支付宝扫码打赏

1.合并日期重叠的活动 1.每年成绩都有所提升的学生