跳到主要内容

得物大数据面试SQL-卖家信用分计算

⚠️ 待修正

一、题目背景

这道题来自得物的数据分析岗面试。得物是潮流电商平台,数据分析师需要从海量业务数据中挖掘洞见。

业务场景:风险控制相关需求在日常工作中频繁出现,这类型的SQL题是面试高频考点。

一、题目

得物平台需要为每个卖家计算信用评分,以便对卖家进行分级管理。信用分由多个维度加权计算得出。给定三张表:seller_info(卖家基础信息)、order_record(订单记录)、return_record(退货记录)。

seller_info 卖家基础信息表:

seller_idseller_nameregister_dateverified
S001潮流买手店A2024-01-15yes
S002球鞋之家2024-03-20yes
S003街头潮流馆2024-06-10no
S004SneakerWorld2024-09-05yes
S005潮品汇2024-11-01yes

order_record 订单记录表:

order_idseller_idbuyer_idorder_timeamountstatus
ORD101S001U5012025-03-013500completed
ORD102S001U5022025-03-052200completed
ORD103S001U5032025-03-105800completed
ORD104S001U5042025-03-151500cancelled
ORD105S002U5052025-03-024200completed
ORD106S002U5062025-03-083100completed
ORD107S003U5072025-03-031800completed
ORD108S003U5082025-03-122500completed
ORD109S004U5092025-03-066800completed
ORD110S004U5102025-03-147200completed
ORD111S004U5112025-03-204300completed
ORD112S005U5122025-03-112100completed

return_record 退货记录表:

return_idorder_idseller_idreturn_timereasonreturn_status
RT001ORD103S0012025-03-18尺码不符completed
RT002ORD110S0042025-03-22商品瑕疵completed
RT003ORD111S0042025-03-25假货质疑pending

要求,信用分由以下维度加权计算(满分100分):

评分维度权重计分规则
成交能力40%已完成订单数 ≥5 得满分,否则按比例(实际/5 × 40)
退货控制30%退货率 ≤10% 得满分,每超过1个百分点扣3分,最低0分
店铺资质20%已认证得满分,未认证得10分
客单价水平10%平均客单价 ≥ 全平台平均客单价 得满分,否则按比例

要求:

  1. 计算每个卖家的各维度得分和最终信用分
  2. 按信用分降序排列
  3. 将卖家分为S级(≥90分)、A级(80-89分)、B级(70-79分)、C级(60-69分)、D级(<60分)

三、思路分析

本题是多维度评分计算的综合性题目,核心考察多表关联、聚合统计、条件评分逻辑和加权计算。难度在于各个维度的计算规则不同,需要分别处理后再加权汇总。

维度评分
题目难度⭐⭐⭐⭐
题目清晰度⭐⭐⭐⭐⭐
业务常见度⭐⭐⭐⭐⭐

解题思路:

  1. 成交能力:从 order_record 统计各卖家 completed 状态的订单数,得分 = MIN(订单数/5, 1) × 40
  2. 退货控制:从 return_record 关联订单表统计退货率,退货率 = 退货订单数 / 已完成订单数,得分 = MAX(0, 30 - (退货率-0.1) × 100 × 3)
  3. 店铺资质:从 seller_info 看 verified 字段,yes 得20分,no 得10分
  4. 客单价水平:计算各卖家平均客单价和全平台平均客单价,得分 = MIN(卖家均价/平台均价, 1) × 10
  5. 汇总四个维度得分 = 最终信用分
  6. 根据最终分数用 CASE WHEN 分级

四、逐步推导

步骤1:计算各维度得分

WITH order_stats AS (
SELECT
seller_id,
COUNT(CASE WHEN status = 'completed' THEN 1 END) AS completed_orders,
AVG(CASE WHEN status = 'completed' THEN amount END) AS avg_order_amount
FROM order_record
GROUP BY seller_id
),
return_stats AS (
SELECT
seller_id,
COUNT(*) AS return_cnt
FROM return_record
WHERE return_status = 'completed'
GROUP BY seller_id
),
platform_avg AS (
SELECT AVG(amount) AS platform_avg_amount
FROM order_record
WHERE status = 'completed'
)
SELECT
s.seller_id,
s.seller_name,
s.verified,
COALESCE(o.completed_orders, 0) AS completed_orders,
COALESCE(o.avg_order_amount, 0) AS avg_order_amount,
ROUND(COALESCE(r.return_cnt, 0) * 100.0 / NULLIF(o.completed_orders, 0), 2) AS return_rate_pct,
-- 成交能力(40分)
ROUND(LEAST(COALESCE(o.completed_orders, 0) / 5.0, 1) * 40, 2) AS score_transaction,
-- 退货控制(30分)
ROUND(GREATEST(0, 30 - GREATEST(0,
(COALESCE(r.return_cnt, 0) * 100.0 / NULLIF(o.completed_orders, 0) - 10)
) * 3), 2) AS score_return,
-- 店铺资质(20分)
CASE WHEN s.verified = 'yes' THEN 20 ELSE 10 END AS score_qualification,
-- 客单价水平(10分)
ROUND(LEAST(COALESCE(o.avg_order_amount, 0) / NULLIF(pa.platform_avg_amount, 0), 1) * 10, 2) AS score_avg_price
FROM seller_info s
LEFT JOIN order_stats o ON s.seller_id = o.seller_id
LEFT JOIN return_stats r ON s.seller_id = r.seller_id
CROSS JOIN platform_avg pa
ORDER BY s.seller_id;

执行结果:

seller_idseller_namecompleted_ordersavg_order_amountreturn_rate_pctscore_transactionscore_returnscore_qualificationscore_avg_price
S001潮流买手店A33833.3333.3324.000.002010.00
S002球鞋之家23650.000.0016.0030.00209.52
S003街头潮流馆22150.000.0016.0030.00105.61
S004SneakerWorld36100.0066.6724.000.002010.00
S005潮品汇12100.000.008.0030.00205.48

步骤2:计算最终信用分并分级

WITH score_detail AS (
-- ... (同上一步的 CTE)
)
SELECT
seller_id,
seller_name,
completed_orders,
ROUND(return_rate_pct, 1) AS return_rate,
score_transaction,
score_return,
score_qualification,
score_avg_price,
ROUND(score_transaction + score_return + score_qualification + score_avg_price, 2) AS total_score,
CASE
WHEN ROUND(score_transaction + score_return + score_qualification + score_avg_price, 2) >= 90 THEN 'S级'
WHEN ROUND(score_transaction + score_return + score_qualification + score_avg_price, 2) >= 80 THEN 'A级'
WHEN ROUND(score_transaction + score_return + score_qualification + score_avg_price, 2) >= 70 THEN 'B级'
WHEN ROUND(score_transaction + score_return + score_qualification + score_avg_price, 2) >= 60 THEN 'C级'
ELSE 'D级'
END AS seller_grade
FROM score_detail
ORDER BY total_score DESC;

执行结果:

seller_idseller_namecompleted_ordersreturn_ratescore_transactionscore_returnscore_qualificationscore_avg_pricetotal_scoreseller_grade
S003街头潮流馆20.016.0030.00105.6161.61C级
S005潮品汇10.08.0030.00205.4863.48C级
S002球鞋之家20.016.0030.00209.5275.52B级
S001潮流买手店A333.324.000.002010.0054.00D级
S004SneakerWorld366.724.000.002010.0054.00D级

五、常见坑点

坑1:NULL值的隐式处理 — 聚合函数跳过NULL,窗口函数不跳过。WHERE需显式过滤或用COALESCE替代。

坑2:数据类型不一致导致JOIN失效 — string vs int隐式转换可能触发全表扫描,性能骤降。

六、举一反三

  1. 增加时间维度对比:按天/周/月分组,观察指标的趋势和季节性波动

  2. 增加分组维度:按品类/地区/渠道拆解,发现差异化的业务洞察

  3. 设置预警阈值:在WHERE中加阈值判断,自动筛选异常数据

七、知识点总结

考点说明
CASE WHEN 条件聚合灵活实现分段统计、条件计数、标签化处理
多表JOINLEFT JOIN保留主表所有数据,COALESCE处理NULL
GROUP BY + 聚合函数分组聚合是数据分析的基础,配合HAVING筛选分组结果
日期函数DATEDIFF / DATE_ADD / unix_timestamp处理日期运算

八、建表语句和数据插入

点击展开 DDL & DML
CREATE TABLE seller_info (
seller_id VARCHAR(10) PRIMARY KEY,
seller_name VARCHAR(50),
register_date DATE,
verified VARCHAR(10)
);

INSERT INTO seller_info VALUES
('S001', '潮流买手店A', '2024-01-15', 'yes'),
('S002', '球鞋之家', '2024-03-20', 'yes'),
('S003', '街头潮流馆', '2024-06-10', 'no'),
('S004', 'SneakerWorld', '2024-09-05', 'yes'),
('S005', '潮品汇', '2024-11-01', 'yes');

CREATE TABLE order_record (
order_id VARCHAR(10) PRIMARY KEY,
seller_id VARCHAR(10),
buyer_id VARCHAR(10),
order_time DATE,
amount DECIMAL(10,2),
status VARCHAR(20)
);

INSERT INTO order_record VALUES
('ORD101', 'S001', 'U501', '2025-03-01', 3500, 'completed'),
('ORD102', 'S001', 'U502', '2025-03-05', 2200, 'completed'),
('ORD103', 'S001', 'U503', '2025-03-10', 5800, 'completed'),
('ORD104', 'S001', 'U504', '2025-03-15', 1500, 'cancelled'),
('ORD105', 'S002', 'U505', '2025-03-02', 4200, 'completed'),
('ORD106', 'S002', 'U506', '2025-03-08', 3100, 'completed'),
('ORD107', 'S003', 'U507', '2025-03-03', 1800, 'completed'),
('ORD108', 'S003', 'U508', '2025-03-12', 2500, 'completed'),
('ORD109', 'S004', 'U509', '2025-03-06', 6800, 'completed'),
('ORD110', 'S004', 'U510', '2025-03-14', 7200, 'completed'),
('ORD111', 'S004', 'U511', '2025-03-20', 4300, 'completed'),
('ORD112', 'S005', 'U512', '2025-03-11', 2100, 'completed');

CREATE TABLE return_record (
return_id VARCHAR(10) PRIMARY KEY,
order_id VARCHAR(10),
seller_id VARCHAR(10),
return_time DATE,
reason VARCHAR(50),
return_status VARCHAR(20)
);

INSERT INTO return_record VALUES
('RT001', 'ORD103', 'S001', '2025-03-18', '尺码不符', 'completed'),
('RT002', 'ORD110', 'S004', '2025-03-22', '商品瑕疵', 'completed'),
('RT003', 'ORD111', 'S004', '2025-03-25', '假货质疑', 'pending');
📱关注公众号

「数据仓库技术」文章同步更新,不错过每一篇干货

微信公众号二维码
💬加群交流

备注「数据仓库技术」加入社群,每日一道大厂SQL真题

交流微信二维码

你可能还想看