这里先贴个不用窗口函数的常规思路的解法:
首先是in_time, out_time跨天都计为活跃的问题, 代码如下, 用union合并, 这样in_time和out_time两列转一列, 在根据日期分组, count(distinct t1.uid) 得到每日dau
select date(t1.active) as active,
count(distinct t1.uid) as dau
from (
select ul.in_time as active,
ul.uid
from tb_user_log ul
union
select ul.out_time as active,
ul.uid
from tb_user_log ul
) t1
group by date(t1.active)其次是每日新增uv计算, 代码如下, 先根据uid分组, 求得min(date(in_time)) 得到初始登录日期, 外层在根据min_date分组, 对uid进行去重统计, 得到每天得新增用户数.
select min_date,
count(distinct t1.uid) as new_uv
from (
select ul.uid,
min(date(ul.in_time)) as min_date
from tb_user_log ul
group by ul.uid
) t1
group by min_date最后, 我们已经获取了每日得新增用户数, 每日活跃用户数, 那么只需根据日期关联, 即可求得答案
注意点: 左连接, 当没有新增活跃用户时, 为null, 需要置为0
select t2.active,
t2.dau,
round(ifnull(t3.new_uv, 0)/t2.dau, 2) as uv_new_ratio
from (
select date(t1.active) as active,
count(distinct t1.uid) as dau
from (
select ul.in_time as active,
ul.uid
from tb_user_log ul
union
select ul.out_time as active,
ul.uid
from tb_user_log ul
) t1
group by date(t1.active)
) t2
left join (
select min_date,
count(distinct t1.uid) as new_uv
from (
select ul.uid,
min(date(ul.in_time)) as min_date
from tb_user_log ul
group by ul.uid
) t1
group by min_date
) t3 on t2.active=t3.min_date
order by t2.active接着是使用窗口函数求解
窗口函数: count() over(partition by uid order by time) 根据用户分组, 登录时间进行排序, 然后计数, 会得到一列, 当用户为第一次登录时, count() 值为1, 然后依次累加, 这里也可以用使用: min(in_time) over(partition by uid) as min_login, 再将该列和正常时间比较, in_time = min_login就是新用户, 甚至可以用rank() over(partition by uid order by time) 都行.
小声吐槽, 我们公司用的mysql5.7版本的, 都不支持窗口函数
select t2.mau_date,
count(distinct uid) as dau,
round(sum(if(times=1,1,0))/count(uid), 2)
from (
select *,
count(*) over (partition by uid order by t1.mau_date) as times
from (
select distinct uid,date(in_time) as mau_date
from tb_user_log
union
select distinct uid,date(out_time) as mau_date
from tb_user_log
) t1
) t2
GROUP by t2.mau_date
order by t2.mau_date


京公网安备 11010502036488号