Summary: user tiering analysis groups users into levels by behavioral or value metrics; the SQL implementation computes per-user metrics first and then assigns tiers with CASE WHEN or functions such as NTILE; fixed rules are recommended for RFM/VIP levels, percentile splits for scenarios without clear thresholds; materialize the results in a wide table, refresh them on a schedule, and label their freshness.

The core of user tiering analysis is to divide users into tiers based on behavioral or value metrics. The key to the SQL implementation is to first compute each user's metrics (e.g. days since last purchase, total spend, order count), and then assign tiers with CASE WHEN or a window function such as NTILE, avoiding hard-coded values while keeping the logic maintainable and aligned with business semantics.
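As a minimal sketch of that first step, assuming an orders table with user_id, price and order_time columns (the same hypothetical schema used in the examples below) and MySQL-style date functions:

SELECT
    user_id,
    DATEDIFF(CURDATE(), MAX(order_time)) AS last_order_days,  -- Recency: days since last purchase
    COUNT(*) AS order_cnt,                                     -- Frequency: number of orders
    COALESCE(SUM(price), 0) AS total_amount                    -- Monetary: total spend
FROM orders
GROUP BY user_id;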
1. Tiering by fixed rules (recommended for RFM and VIP levels)
This suits tiers with a clear business definition, e.g. "active in the last 30 days with spend ≥ 500 yuan is VIP1, active in the last 7 days with spend ≥ 2000 yuan is VIP2". Map them directly with CASE WHEN; the logic is explicit and easy to audit:
SELECT
    user_id,
    total_amount,
    last_order_days,
    -- Order the branches from strictest to loosest: CASE returns on the first match
    CASE
        WHEN last_order_days <= 7  AND total_amount >= 2000 THEN 'VIP2'
        WHEN last_order_days <= 30 AND total_amount >= 500  THEN 'VIP1'
        WHEN last_order_days <= 90 AND total_amount > 0     THEN 'Regular active'
        ELSE 'Dormant'
    END AS user_level
FROM (
    SELECT
        user_id,
        COALESCE(SUM(price), 0) AS total_amount,
        DATEDIFF(CURDATE(), MAX(order_time)) AS last_order_days
    FROM orders
    GROUP BY user_id
) t;
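The query above relies on MySQL's DATEDIFF/CURDATE. As a hedged sketch only, the same per-user metrics in PostgreSQL (same assumed orders schema) could be computed like this:

SELECT
    user_id,
    COALESCE(SUM(price), 0) AS total_amount,
    (CURRENT_DATE - MAX(order_time)::date) AS last_order_days  -- date subtraction yields days
FROM orders
GROUP BY user_id;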
2. Tiering by percentile of the distribution (for scenarios without clear thresholds)
When the business has not pinned down what counts as "high value" and instead wants a relative split, say the top 10% as the S tier and the middle 60% as the A tier, NTILE(10) or PERCENT_RANK() gives a fairer result:
- NTILE(10) splits users into 10 equal-sized buckets (note: the buckets can be uneven when the data volume is small); see the NTILE sketch after the example below
- PERCENT_RANK() returns a relative rank between 0 and 1, which can then be segmented with CASE for more precise cut-offs
Example (4 tiers by total spend):
SELECT
    user_id,
    total_amount,
    CASE
        WHEN prk <= 0.1 THEN 'S (top 10%)'
        WHEN prk <= 0.5 THEN 'A (top 50%)'
        WHEN prk <= 0.9 THEN 'B (long tail)'
        ELSE 'C (bottom 10%)'
    END AS level_by_rank
FROM (
    SELECT
        user_id,
        SUM(price) AS total_amount,
        -- DESC ordering: prk = 0 is the biggest spender
        PERCENT_RANK() OVER (ORDER BY SUM(price) DESC) AS prk
    FROM orders
    GROUP BY user_id
) t;
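For comparison, the NTILE variant mentioned in the list above could look like the following sketch; with the DESC ordering, decile 1 holds the highest spenders, and the buckets are only approximately equal when the user count is not divisible by 10:

SELECT
    user_id,
    total_amount,
    NTILE(10) OVER (ORDER BY total_amount DESC) AS spend_decile  -- 1 = top 10%, 10 = bottom 10%
FROM (
    SELECT user_id, SUM(price) AS total_amount
    FROM orders
    GROUP BY user_id
) t;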
3. Tiering by dynamic thresholds (avoiding manual tuning)
If the user distribution shifts a lot from month to month, fixed cut-offs quickly go stale. You can compute the current quantile cut-offs first and then join them back against the user metrics. Note that PERCENTILE_CONT ... WITHIN GROUP is supported by PostgreSQL, SQL Server and Oracle, but not by MySQL:
WITH user_metrics AS (
    SELECT user_id, COALESCE(SUM(price), 0) AS total_amount
    FROM orders
    GROUP BY user_id
),
-- Quantiles are taken from the already-aggregated totals, so orders is scanned only once
quantiles AS (
    SELECT
        PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY total_amount) AS q1,
        PERCENTILE_CONT(0.5)  WITHIN GROUP (ORDER BY total_amount) AS q2,
        PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY total_amount) AS q3
    FROM user_metrics
)
SELECT
    u.user_id,
    u.total_amount,
    CASE
        WHEN u.total_amount >= q.q3 THEN 'High value'
        WHEN u.total_amount >= q.q2 THEN 'Mid value'
        WHEN u.total_amount >= q.q1 THEN 'Potential'
        ELSE 'New / low frequency'
    END AS dynamic_level
FROM user_metrics u
CROSS JOIN quantiles q;
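If you are on MySQL, which lacks PERCENTILE_CONT, a hedged sketch of the same idea is to approximate the quartile split with NTILE(4); this assigns equal-count buckets rather than exact quantile cut-offs, which is usually close enough for tiering:

WITH user_metrics AS (
    SELECT user_id, COALESCE(SUM(price), 0) AS total_amount
    FROM orders
    GROUP BY user_id
),
ranked AS (
    SELECT
        user_id,
        total_amount,
        NTILE(4) OVER (ORDER BY total_amount DESC) AS quartile  -- 1 = top 25% of spenders
    FROM user_metrics
)
SELECT
    user_id,
    total_amount,
    CASE quartile
        WHEN 1 THEN 'High value'
        WHEN 2 THEN 'Mid value'
        WHEN 3 THEN 'Potential'
        ELSE 'New / low frequency'
    END AS dynamic_level
FROM ranked;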
4. Reusing and refreshing the tiering results
Tiering is not a one-off exercise; it needs to support periodic refreshes and downstream consumption:
- Put the core metrics (days since last order, total spend, order count) in a dedicated wide table and update it incrementally every day
- Store the tier field on the user dimension table and recompute it with a scheduled job (e.g. nightly), so each analysis does not repeat the aggregation
- When exposing the data as a view, include a level_updated_at column so downstream teams know how fresh it is (see the sketch after this list)
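A minimal sketch of this setup, assuming MySQL and hypothetical object names (user_level_wide, v_user_level); for simplicity it recomputes all users on each run, whereas an incremental job would restrict orders to the most recent partition:

-- Wide table holding per-user metrics plus the assigned tier
CREATE TABLE user_level_wide (
    user_id          BIGINT PRIMARY KEY,
    total_amount     DECIMAL(12, 2) NOT NULL DEFAULT 0,
    order_cnt        INT            NOT NULL DEFAULT 0,
    last_order_days  INT,
    user_level       VARCHAR(32),
    level_updated_at DATETIME       NOT NULL
);

-- Nightly job: recompute metrics and tiers, then upsert into the wide table
INSERT INTO user_level_wide
    (user_id, total_amount, order_cnt, last_order_days, user_level, level_updated_at)
SELECT
    user_id,
    COALESCE(SUM(price), 0) AS total_amount,
    COUNT(*) AS order_cnt,
    DATEDIFF(CURDATE(), MAX(order_time)) AS last_order_days,
    CASE
        WHEN DATEDIFF(CURDATE(), MAX(order_time)) <= 7  AND COALESCE(SUM(price), 0) >= 2000 THEN 'VIP2'
        WHEN DATEDIFF(CURDATE(), MAX(order_time)) <= 30 AND COALESCE(SUM(price), 0) >= 500  THEN 'VIP1'
        WHEN DATEDIFF(CURDATE(), MAX(order_time)) <= 90 AND COALESCE(SUM(price), 0) > 0     THEN 'Regular active'
        ELSE 'Dormant'
    END AS user_level,
    NOW() AS level_updated_at
FROM orders
GROUP BY user_id
ON DUPLICATE KEY UPDATE
    total_amount     = VALUES(total_amount),
    order_cnt        = VALUES(order_cnt),
    last_order_days  = VALUES(last_order_days),
    user_level       = VALUES(user_level),
    level_updated_at = VALUES(level_updated_at);

-- View exposed to downstream consumers, freshness column included
CREATE OR REPLACE VIEW v_user_level AS
SELECT user_id, user_level, level_updated_at
FROM user_level_wide;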