【场景】:次日留存率

【分类】:多表连接date_sub(dt, interval 1 day) = min_dt

分析思路

难点:

1.怎么得到第一天的用户第二天又有浏览记录

2.第一天有用户活跃第二天没有用户活跃,怎么处理?使用 right join ,以有新增用户的表为主。

新学到:

1.统计2021年11月: like '2021-11%'

(1)统计用户的活跃记录

如果in_time-进入时间out_time-离开时间跨天了,在两天里都记为该用户活跃过

  • [使用]:union把两个活跃记录合并

(2)统计新增用户

新用户:第一次进入时间(天)就是该时间新增用户

  • [使用]:min() group uid

(3)统计2021年11月每天新用户的次日留存率

怎么得到今天的用户第二天又有浏览记录:让用户ID相等并且‘一个日期减去一天是另一个日期’或者‘一个日期加上一天是另外一个日期’

  • [使用]:date_add(min_dt, interval 1 day) = dt;或者使用 date_sub(dt, interval 1 day) = min_dt

求解代码

方法一:

with 子句 + 一步步拆解

with
    main as(
        #统计用户的活跃记录
        (select
            uid,
            date(in_time) as dt
        from tb_user_log)
        union
        (select
            uid,
            date(out_time) as dt
        from tb_user_log)
    ),
    attr as(
        #统计新增用户
        select
            uid,
            min(date(in_time)) as min_dt
        from tb_user_log
        group by uid
    ),
    attr1 as(
        #统计每天新增用户数
        select
            min_dt,
            count(uid) as cnt_one_activity
        from attr
        group by min_dt
    )
    ,attr2 as(
        #统计今天活跃第二天又活跃的用户个数
        select
            attr.min_dt,
            count(*) as cnt_two_activity
        from main
        join attr on main.uid = attr.uid and date_add(min_dt, interval 1 day) = dt
        group by min_dt
    )

#统计2021年11月每天新用户的次日留存率,(保留2位小数)
select
    min_dt as dt,
    round(ifnull(cnt_two_activity/cnt_one_activity,0),2) as uv_left_rate
from attr1
left join attr2 using(min_dt)
where min_dt like '2021-11%'
order by dt

方法二:

使用 year(in_time) = 2021 and month(in_time) = 11;date_add

with
    main as(
        #统计用户的活跃记录
        (select
            uid,
            date(in_time) as dt
        from tb_user_log
        where year(in_time) = 2021
        and month(in_time) = 11)
        union
        (select
            uid,
            date(out_time) as dt
        from tb_user_log
        where year(out_time) = 2021
        and month(out_time) = 11)
    ),
    attr as(
        #统计新增用户
        select
            uid,
            min(date(in_time)) as min_dt
        from tb_user_log
        where year(in_time) = 2021
        and month(in_time) = 11
        group by uid
    )

#统计2021年11月每天新用户的次日留存率,(保留2位小数)
select
    min_dt as dt,
    round(ifnull(count(distinct main.uid)/count(attr.uid),0),2) as uv_left_rate
from main
right join attr on main.uid = attr.uid and date_add(min_dt, interval 1 day) = dt
group by min_dt
order by min_dt

方法三:

使用 like '2021-11%';date_add

with
    main as(
        #统计用户的活跃记录
        (select
            uid,
            date(in_time) as dt
        from tb_user_log)
        union
        (select
            uid,
            date(out_time) as dt
        from tb_user_log)
    ),
    attr as(
        #统计新增用户
        select
            uid,
            min(date(in_time)) as min_dt
        from tb_user_log
        group by uid
    )

#统计2021年11月每天新用户的次日留存率,(保留2位小数)
select
    min_dt as dt,
    round(ifnull(count(distinct main.uid)/count(attr.uid),0),2) as uv_left_rate
from main
right join attr on main.uid = attr.uid and date_add(min_dt, interval 1 day) = dt
where min_dt like '2021-11%'
group by min_dt
order by min_dt

方法四:

使用 like '2021-11%';date_sub

with
    main as(
        #统计用户的活跃记录
        (select
            uid,
            date(in_time) as dt
        from tb_user_log)
        union
        (select
            uid,
            date(out_time) as dt
        from tb_user_log)
    ),
    attr as(
        #统计新增用户
        select
            uid,
            min(date(in_time)) as min_dt
        from tb_user_log
        group by uid
    )

#统计2021年11月每天新用户的次日留存率,(保留2位小数)
select
    min_dt as dt,
    round(ifnull(count(distinct main.uid)/count(attr.uid),0),2) as uv_left_rate
from main
right join attr on main.uid = attr.uid and date_sub(dt, interval 1 day) = min_dt
where min_dt like '2021-11%'
group by min_dt
order by min_dt

方法五:

多表连接

#统计2021年11月每天新用户的次日留存率,(保留2位小数)
select
    min_dt as dt,
    round(ifnull(count(distinct main.uid)/count(attr.uid),0),2) as uv_left_rate
from(
    #统计用户的活跃记录
    (select
        uid,
        date(in_time) as dt
    from tb_user_log)
    union
    (select
        uid,
        date(out_time) as dt
    from tb_user_log)
) main
right join(
    #统计新增用户
    select
        uid,
        min(date(in_time)) as min_dt
    from tb_user_log
    group by uid
) attr on main.uid = attr.uid and date_add(min_dt, interval 1 day) = dt
where min_dt like '2021-11%'
group by min_dt
order by min_dt