得物大数据面试SQL-卖家信用分计算
⚠️ 待修正
一、题目背景
这道题来自得物的数据分析岗面试。得物是潮流电商平台,数据分析师需要从海量业务数据中挖掘洞见。
业务场景:风险控制相关需求在日常工作中频繁出现,这类型的SQL题是面试高频考点。
一、题目
得物平台需要为每个卖家计算信用评分,以便对卖家进行分级管理。信用分由多个维度加权计算得出。给定三张表:seller_info(卖家基础信息)、order_record(订单记录)、return_record(退货记录)。
seller_info 卖家基础信息表:
| seller_id | seller_name | register_date | verified |
|---|---|---|---|
| S001 | 潮流买手店A | 2024-01-15 | yes |
| S002 | 球鞋之家 | 2024-03-20 | yes |
| S003 | 街头潮流馆 | 2024-06-10 | no |
| S004 | SneakerWorld | 2024-09-05 | yes |
| S005 | 潮品汇 | 2024-11-01 | yes |
order_record 订单记录表:
| order_id | seller_id | buyer_id | order_time | amount | status |
|---|---|---|---|---|---|
| ORD101 | S001 | U501 | 2025-03-01 | 3500 | completed |
| ORD102 | S001 | U502 | 2025-03-05 | 2200 | completed |
| ORD103 | S001 | U503 | 2025-03-10 | 5800 | completed |
| ORD104 | S001 | U504 | 2025-03-15 | 1500 | cancelled |
| ORD105 | S002 | U505 | 2025-03-02 | 4200 | completed |
| ORD106 | S002 | U506 | 2025-03-08 | 3100 | completed |
| ORD107 | S003 | U507 | 2025-03-03 | 1800 | completed |
| ORD108 | S003 | U508 | 2025-03-12 | 2500 | completed |
| ORD109 | S004 | U509 | 2025-03-06 | 6800 | completed |
| ORD110 | S004 | U510 | 2025-03-14 | 7200 | completed |
| ORD111 | S004 | U511 | 2025-03-20 | 4300 | completed |
| ORD112 | S005 | U512 | 2025-03-11 | 2100 | completed |
return_record 退货记录表:
| return_id | order_id | seller_id | return_time | reason | return_status |
|---|---|---|---|---|---|
| RT001 | ORD103 | S001 | 2025-03-18 | 尺码不符 | completed |
| RT002 | ORD110 | S004 | 2025-03-22 | 商品瑕疵 | completed |
| RT003 | ORD111 | S004 | 2025-03-25 | 假货质疑 | pending |
要求,信用分由以下维度加权计算(满分100分):
| 评分维度 | 权重 | 计分规则 |
|---|---|---|
| 成交能力 | 40% | 已完成订单数 ≥5 得满分,否则按比例(实际/5 × 40) |
| 退货控制 | 30% | 退货率 ≤10% 得满分,每超过1个百分点扣3分,最低0分 |
| 店铺资质 | 20% | 已认证得满分,未认证得10分 |
| 客单价水平 | 10% | 平均客单价 ≥ 全平台平均客单价 得满分,否则按比例 |
要求:
- 计算每个卖家的各维度得分和最终信用分
- 按信用分降序排列
- 将卖家分为S级(≥90分)、A级(80-89分)、B级(70-79分)、C级(60-69分)、D级(<60分)
三、思路分析
本题是多维度评分计算的综合性题目,核心考察多表关联、聚合统计、条件评分逻辑和加权计算。难度在于各个维度的计算规则不同,需要分别处理后再加权汇总。
| 维度 | 评分 |
|---|---|
| 题目难度 | ⭐⭐⭐⭐ |
| 题目清晰度 | ⭐⭐⭐⭐⭐ |
| 业务常见度 | ⭐⭐⭐⭐⭐ |
解题思路:
- 成交能力:从 order_record 统计各卖家 completed 状态的订单数,得分 = MIN(订单数/5, 1) × 40
- 退货控制:从 return_record 关联订单表统计退货率,退货率 = 退货订单数 / 已完成订单数,得分 = MAX(0, 30 - (退货率-0.1) × 100 × 3)
- 店铺资质:从 seller_info 看 verified 字段,yes 得20分,no 得10分
- 客单价水平:计算各卖家平均客单价和全平台平均客单价,得分 = MIN(卖家均价/平台均价, 1) × 10
- 汇总四个维度得分 = 最终信用分
- 根据最终分数用 CASE WHEN 分级
四、逐步推导
步骤1:计算各维度得分
WITH order_stats AS (
SELECT
seller_id,
COUNT(CASE WHEN status = 'completed' THEN 1 END) AS completed_orders,
AVG(CASE WHEN status = 'completed' THEN amount END) AS avg_order_amount
FROM order_record
GROUP BY seller_id
),
return_stats AS (
SELECT
seller_id,
COUNT(*) AS return_cnt
FROM return_record
WHERE return_status = 'completed'
GROUP BY seller_id
),
platform_avg AS (
SELECT AVG(amount) AS platform_avg_amount
FROM order_record
WHERE status = 'completed'
)
SELECT
s.seller_id,
s.seller_name,
s.verified,
COALESCE(o.completed_orders, 0) AS completed_orders,
COALESCE(o.avg_order_amount, 0) AS avg_order_amount,
ROUND(COALESCE(r.return_cnt, 0) * 100.0 / NULLIF(o.completed_orders, 0), 2) AS return_rate_pct,
-- 成交能力(40分)
ROUND(LEAST(COALESCE(o.completed_orders, 0) / 5.0, 1) * 40, 2) AS score_transaction,
-- 退货控制(30分)
ROUND(GREATEST(0, 30 - GREATEST(0,
(COALESCE(r.return_cnt, 0) * 100.0 / NULLIF(o.completed_orders, 0) - 10)
) * 3), 2) AS score_return,
-- 店铺资质(20分)
CASE WHEN s.verified = 'yes' THEN 20 ELSE 10 END AS score_qualification,
-- 客单价水平(10分)
ROUND(LEAST(COALESCE(o.avg_order_amount, 0) / NULLIF(pa.platform_avg_amount, 0), 1) * 10, 2) AS score_avg_price
FROM seller_info s
LEFT JOIN order_stats o ON s.seller_id = o.seller_id
LEFT JOIN return_stats r ON s.seller_id = r.seller_id
CROSS JOIN platform_avg pa
ORDER BY s.seller_id;
执行结果:
| seller_id | seller_name | completed_orders | avg_order_amount | return_rate_pct | score_transaction | score_return | score_qualification | score_avg_price |
|---|---|---|---|---|---|---|---|---|
| S001 | 潮流买手店A | 3 | 3833.33 | 33.33 | 24.00 | 0.00 | 20 | 10.00 |
| S002 | 球鞋之家 | 2 | 3650.00 | 0.00 | 16.00 | 30.00 | 20 | 9.52 |
| S003 | 街头潮流馆 | 2 | 2150.00 | 0.00 | 16.00 | 30.00 | 10 | 5.61 |
| S004 | SneakerWorld | 3 | 6100.00 | 66.67 | 24.00 | 0.00 | 20 | 10.00 |
| S005 | 潮品汇 | 1 | 2100.00 | 0.00 | 8.00 | 30.00 | 20 | 5.48 |
步骤2:计算最终信用分并分级
WITH score_detail AS (
-- ... (同上一步的 CTE)
)
SELECT
seller_id,
seller_name,
completed_orders,
ROUND(return_rate_pct, 1) AS return_rate,
score_transaction,
score_return,
score_qualification,
score_avg_price,
ROUND(score_transaction + score_return + score_qualification + score_avg_price, 2) AS total_score,
CASE
WHEN ROUND(score_transaction + score_return + score_qualification + score_avg_price, 2) >= 90 THEN 'S级'
WHEN ROUND(score_transaction + score_return + score_qualification + score_avg_price, 2) >= 80 THEN 'A级'
WHEN ROUND(score_transaction + score_return + score_qualification + score_avg_price, 2) >= 70 THEN 'B级'
WHEN ROUND(score_transaction + score_return + score_qualification + score_avg_price, 2) >= 60 THEN 'C级'
ELSE 'D级'
END AS seller_grade
FROM score_detail
ORDER BY total_score DESC;
执行结果:
| seller_id | seller_name | completed_orders | return_rate | score_transaction | score_return | score_qualification | score_avg_price | total_score | seller_grade |
|---|---|---|---|---|---|---|---|---|---|
| S003 | 街头潮流馆 | 2 | 0.0 | 16.00 | 30.00 | 10 | 5.61 | 61.61 | C级 |
| S005 | 潮品汇 | 1 | 0.0 | 8.00 | 30.00 | 20 | 5.48 | 63.48 | C级 |
| S002 | 球鞋之家 | 2 | 0.0 | 16.00 | 30.00 | 20 | 9.52 | 75.52 | B级 |
| S001 | 潮流买手店A | 3 | 33.3 | 24.00 | 0.00 | 20 | 10.00 | 54.00 | D级 |
| S004 | SneakerWorld | 3 | 66.7 | 24.00 | 0.00 | 20 | 10.00 | 54.00 | D级 |
五、常见坑点
坑1:NULL值的隐式处理 — 聚合函数跳过NULL,窗口函数不跳过。WHERE需显式过滤或用COALESCE替代。
坑2:数据类型不一致导致JOIN失效 — string vs int隐式转换可能触发全表扫描,性能骤降。
六、举一反三
-
增加时间维度对比:按天/周/月分组,观察指标的趋势和季节性波动
-
增加分组维度:按品类/地区/渠道拆解,发现差异化的业务洞察
-
设置预警阈值:在WHERE中加阈值判断,自动筛选异常数据
七、知识点总结
| 考点 | 说明 |
|---|---|
| CASE WHEN 条件聚合 | 灵活实现分段统计、条件计数、标签化处理 |
| 多表JOIN | LEFT JOIN保留主表所有数据,COALESCE处理NULL |
| GROUP BY + 聚合函数 | 分组聚合是数据分析的基础,配合HAVING筛选分组结果 |
| 日期函数 | DATEDIFF / DATE_ADD / unix_timestamp处理日期运算 |
八、建表语句和数据插入
点击展开 DDL & DML
CREATE TABLE seller_info (
seller_id VARCHAR(10) PRIMARY KEY,
seller_name VARCHAR(50),
register_date DATE,
verified VARCHAR(10)
);
INSERT INTO seller_info VALUES
('S001', '潮流买手店A', '2024-01-15', 'yes'),
('S002', '球鞋之家', '2024-03-20', 'yes'),
('S003', '街头潮流馆', '2024-06-10', 'no'),
('S004', 'SneakerWorld', '2024-09-05', 'yes'),
('S005', '潮品汇', '2024-11-01', 'yes');
CREATE TABLE order_record (
order_id VARCHAR(10) PRIMARY KEY,
seller_id VARCHAR(10),
buyer_id VARCHAR(10),
order_time DATE,
amount DECIMAL(10,2),
status VARCHAR(20)
);
INSERT INTO order_record VALUES
('ORD101', 'S001', 'U501', '2025-03-01', 3500, 'completed'),
('ORD102', 'S001', 'U502', '2025-03-05', 2200, 'completed'),
('ORD103', 'S001', 'U503', '2025-03-10', 5800, 'completed'),
('ORD104', 'S001', 'U504', '2025-03-15', 1500, 'cancelled'),
('ORD105', 'S002', 'U505', '2025-03-02', 4200, 'completed'),
('ORD106', 'S002', 'U506', '2025-03-08', 3100, 'completed'),
('ORD107', 'S003', 'U507', '2025-03-03', 1800, 'completed'),
('ORD108', 'S003', 'U508', '2025-03-12', 2500, 'completed'),
('ORD109', 'S004', 'U509', '2025-03-06', 6800, 'completed'),
('ORD110', 'S004', 'U510', '2025-03-14', 7200, 'completed'),
('ORD111', 'S004', 'U511', '2025-03-20', 4300, 'completed'),
('ORD112', 'S005', 'U512', '2025-03-11', 2100, 'completed');
CREATE TABLE return_record (
return_id VARCHAR(10) PRIMARY KEY,
order_id VARCHAR(10),
seller_id VARCHAR(10),
return_time DATE,
reason VARCHAR(50),
return_status VARCHAR(20)
);
INSERT INTO return_record VALUES
('RT001', 'ORD103', 'S001', '2025-03-18', '尺码不符', 'completed'),
('RT002', 'ORD110', 'S004', '2025-03-22', '商品瑕疵', 'completed'),
('RT003', 'ORD111', 'S004', '2025-03-25', '假货质疑', 'pending');
📱关注公众号
「数据仓库技术」文章同步更新,不错过每一篇干货

💬加群交流
备注「数据仓库技术」加入社群,每日一道大厂SQL真题
