with t0 as(
select
uid,date(in_time) dt
from tb_user_log
union
select
uid,date(out_time) dt
from tb_user_log
),
t1 as(
select
*,min(dt)over(partition by uid order by dt) first_day
from t0
)
select
dt
,count(1) dau
,round(sum(if(first_day=dt,1,0))/count(1),2) ration
from t1
group by 1
order by 1
题目
统计每天的日活数及新用户占比
一、拆解题目
1.求什么
每天:以天聚合
日活数:每天活跃人数:date,count(1),以group by date
新用户占比:每日,活跃的新用户人数/当日总活跃人数
活跃的新用户人数:2种方法如下
(1)明细表基础上
① 加一列,Min()开窗,标出每个用户首日活跃日期。
②如果(i)统计日期date,和首日相等,在那一天,ta就是新用户。
③合计sum(if(统计日期=首日活跃,即为新用户,记1,否则都记0),以此求出新用户数量
(2)聚合,分别求出新用户数量,再和聚合的总日活left join
基本思路还是:date,count(1),以group by date
加一个人群限定:where (uid,date) in 子查询(select uid, min(date)first_day from group by 1)
2.输出什么
date,dau,ratio
二、步骤
数据预处理
本题跨天,均记活跃,所以需要union,把in_time和out_time并在一列,并去重
with t0 as(
select
uid,date(in_time) dt
from tb_user_log
union
select
uid,date(out_time) dt
from tb_user_log
),
方法1:标记新用户首日,sum(if)统计新用户,count(1)统计全部活跃用户
① 加一列,Min()开窗,标出每个用户首日活跃日期。
②如果(if)统计日期date,和首日相等,在那一天,ta就是新用户。
③合计sum(if(统计日期=首日活跃,即为新用户,记1,否则都记0),以此求出新用户数量
t1 as(
select
*,min(dt)over(partition by uid order by dt) first_day
from t0
)
select
dt
,count(1) dau
,round(sum(if(first_day=dt,1,0))/count(1),2) uv_new_ratio
from t1
group by 1
order by 1
方法2:分别聚合,再左连接
①t1:新用户人数,用where in 实现新用户筛选,用dt,count(1)实现人数统计
②每日活跃人数:用from子查询实现——为什么不直接连原表t0。因为t0和t1,一个汇总一个明细,dt连接,1对多,会出现笛卡尔积。
③left join t1
④计算,这种除法,注意处理一下分子分母为0的情况:ifnull
代码
t1 as(#新用户人数
select
dt,count(1) new
from t0
where (uid,dt) in(select uid,min(dt) first_day from t0
group by 1)
group by 1)
select
dt
,dau
,round(ifnull(new/dau,0),2) uv_new_ratio
from (select dt,count(1) dau from t0
group by 1) a
left join t1 using(dt)
order by 1
三、完整代码
方法1
with t0 as(
select
uid,date(in_time) dt
from tb_user_log
union
select
uid,date(out_time) dt
from tb_user_log
),
t1 as(
select
*,min(dt)over(partition by uid order by dt) first_day
from t0)
select
dt
,count(1) dau
,round(sum(if(first_day=dt,1,0))/count(1),2) uv_new_ratio
from t1
group by 1
order by 1
方法2
with t0 as(
select
uid,date(in_time) dt
from tb_user_log
union
select
uid,date(out_time) dt
from tb_user_log
),
t1 as(#新用户人数
select
dt,count(1) new
from t0
where (uid,dt) in(select uid,min(dt) first_day from t0
group by 1)
group by 1)
select
dt
,dau
,round(ifnull(new/dau,0),2) uv_new_ratio
from (select dt,count(1) dau from t0
group by 1) a
left join t1 using(dt)
order by 1



京公网安备 11010502036488号