思路: 我的思路是通过“今日与最晚活跃日期时间差”和“今日与最早活跃日期时间差”共同来确定用户的等级,所以首先生成一个表a,主要包含三个字段:uid,今日与最晚活跃日期时间差,今日与最早活跃日期时间差:
select uid,
datediff((select max(out_time) from tb_user_log),max(out_time)) as 今日与最后活跃日时间差,
datediff((select max(out_time) from tb_user_log),min(in_time)) as 今日与最早活跃日时间差
from tb_user_log
group by uid;
运行结果如下: 然后,判断的逻辑如下:如果今日与最后活跃日时间差大于29天(包含今日在内),那么必然是流失用户;如果今日与最后活跃日时间差小于等于29天但大于等于7天,那么必然是沉睡用户;如果今日与最后活跃日时间小于等于6天,则如果“今日与最晚活跃日期时间差”和“今日与最早活跃日期时间差”相等说明是新晋用户,不相等说明是忠实用户,因此从之前建立的表a中可以提取设置新变量用户等级:
select uid,
case when 今日与最后活跃日时间差 > 29 then '流失用户'
when 今日与最后活跃日时间差 <= 29 and 今日与最后活跃日时间差 >= 7 then '沉睡用户'
when 今日与最后活跃日时间差 <= 6 and 今日与最后活跃日时间差 = 今日与最早活跃日时间差 then '新晋用户'
else '活跃用户' end as 用户分级
from
(select uid,
datediff((select max(out_time) from tb_user_log),max(out_time)) as 今日与最后活跃日时间差,
datediff((select max(out_time) from tb_user_log),min(in_time)) as 今日与最早活跃日时间差
from tb_user_log
group by uid) as a
结果如下: 最后我们从上表中提取比例:
select 用户分级,
round(count(uid)/(select count(distinct uid) from tb_user_log),2) as 比例
from
(select uid,
case when 今日与最后活跃日时间差 > 29 then '流失用户'
when 今日与最后活跃日时间差 <= 29 and 今日与最后活跃日时间差 >= 7 then '沉睡用户'
when 今日与最后活跃日时间差 <= 6 and 今日与最后活跃日时间差 = 今日与最早活跃日时间差 then '新晋用户'
else '忠实用户' end as 用户分级
from
(select uid,
datediff((select max(out_time) from tb_user_log),max(out_time)) as 今日与最后活跃日时间差,
datediff((select max(out_time) from tb_user_log),min(in_time)) as 今日与最早活跃日时间差
from tb_user_log
group by uid) as a) as b
group by 用户分级
order by 比例 desc;