【场景】:次日留存率
【分类】:多表连接、date_sub(dt, interval 1 day) = min_dt
分析思路
难点:
1.怎么得到第一天的用户第二天又有浏览记录
2.第一天有用户活跃第二天没有用户活跃,怎么处理?使用 right join ,以有新增用户的表为主。
新学到:
1.统计2021年11月: like '2021-11%'
(1)统计用户的活跃记录
如果in_time-进入时间和out_time-离开时间跨天了,在两天里都记为该用户活跃过
- [使用]:union把两个活跃记录合并
(2)统计新增用户
新用户:第一次进入时间(天)就是该时间新增用户
- [使用]:min() group uid
(3)统计2021年11月每天新用户的次日留存率
怎么得到今天的用户第二天又有浏览记录:让用户ID相等并且‘一个日期减去一天是另一个日期’或者‘一个日期加上一天是另外一个日期’
- [使用]:date_add(min_dt, interval 1 day) = dt;或者使用 date_sub(dt, interval 1 day) = min_dt
求解代码
方法一:
with 子句 + 一步步拆解
with
main as(
#统计用户的活跃记录
(select
uid,
date(in_time) as dt
from tb_user_log)
union
(select
uid,
date(out_time) as dt
from tb_user_log)
),
attr as(
#统计新增用户
select
uid,
min(date(in_time)) as min_dt
from tb_user_log
group by uid
),
attr1 as(
#统计每天新增用户数
select
min_dt,
count(uid) as cnt_one_activity
from attr
group by min_dt
)
,attr2 as(
#统计今天活跃第二天又活跃的用户个数
select
attr.min_dt,
count(*) as cnt_two_activity
from main
join attr on main.uid = attr.uid and date_add(min_dt, interval 1 day) = dt
group by min_dt
)
#统计2021年11月每天新用户的次日留存率,(保留2位小数)
select
min_dt as dt,
round(ifnull(cnt_two_activity/cnt_one_activity,0),2) as uv_left_rate
from attr1
left join attr2 using(min_dt)
where min_dt like '2021-11%'
order by dt
方法二:
使用 year(in_time) = 2021 and month(in_time) = 11;date_add
with
main as(
#统计用户的活跃记录
(select
uid,
date(in_time) as dt
from tb_user_log
where year(in_time) = 2021
and month(in_time) = 11)
union
(select
uid,
date(out_time) as dt
from tb_user_log
where year(out_time) = 2021
and month(out_time) = 11)
),
attr as(
#统计新增用户
select
uid,
min(date(in_time)) as min_dt
from tb_user_log
where year(in_time) = 2021
and month(in_time) = 11
group by uid
)
#统计2021年11月每天新用户的次日留存率,(保留2位小数)
select
min_dt as dt,
round(ifnull(count(distinct main.uid)/count(attr.uid),0),2) as uv_left_rate
from main
right join attr on main.uid = attr.uid and date_add(min_dt, interval 1 day) = dt
group by min_dt
order by min_dt
方法三:
使用 like '2021-11%';date_add
with
main as(
#统计用户的活跃记录
(select
uid,
date(in_time) as dt
from tb_user_log)
union
(select
uid,
date(out_time) as dt
from tb_user_log)
),
attr as(
#统计新增用户
select
uid,
min(date(in_time)) as min_dt
from tb_user_log
group by uid
)
#统计2021年11月每天新用户的次日留存率,(保留2位小数)
select
min_dt as dt,
round(ifnull(count(distinct main.uid)/count(attr.uid),0),2) as uv_left_rate
from main
right join attr on main.uid = attr.uid and date_add(min_dt, interval 1 day) = dt
where min_dt like '2021-11%'
group by min_dt
order by min_dt
方法四:
使用 like '2021-11%';date_sub
with
main as(
#统计用户的活跃记录
(select
uid,
date(in_time) as dt
from tb_user_log)
union
(select
uid,
date(out_time) as dt
from tb_user_log)
),
attr as(
#统计新增用户
select
uid,
min(date(in_time)) as min_dt
from tb_user_log
group by uid
)
#统计2021年11月每天新用户的次日留存率,(保留2位小数)
select
min_dt as dt,
round(ifnull(count(distinct main.uid)/count(attr.uid),0),2) as uv_left_rate
from main
right join attr on main.uid = attr.uid and date_sub(dt, interval 1 day) = min_dt
where min_dt like '2021-11%'
group by min_dt
order by min_dt
方法五:
多表连接
#统计2021年11月每天新用户的次日留存率,(保留2位小数)
select
min_dt as dt,
round(ifnull(count(distinct main.uid)/count(attr.uid),0),2) as uv_left_rate
from(
#统计用户的活跃记录
(select
uid,
date(in_time) as dt
from tb_user_log)
union
(select
uid,
date(out_time) as dt
from tb_user_log)
) main
right join(
#统计新增用户
select
uid,
min(date(in_time)) as min_dt
from tb_user_log
group by uid
) attr on main.uid = attr.uid and date_add(min_dt, interval 1 day) = dt
where min_dt like '2021-11%'
group by min_dt
order by min_dt