一步一步来吧。
确定每个用户最早出现的日期,即每个用户作为新用户时的日期
select user_id, min(date) as first_date from login group by user_id
基于step1结果确定每个新用户次日留存了的记录。用right join留存了则保留次日日期,未留存则为None
select b.user_id, b.first_date, a.date from login as a right join ( select user_id, min(date) as first_date from login group by user_id ) as b on a.user_id=b.user_id and datediff(a.date, b.first_date)=1
- 从step2结果中计算有新用户出现的每天的新用户次日留存率。
select new_user.first_date as `date`, round(count(new_user.date) / count(new_user.user_id), 3) as p from ( select b.user_id, b.first_date, a.date from login as a right join ( select user_id, min(date) as first_date from login group by user_id ) as b on a.user_id=b.user_id and datediff(a.date, b.first_date)=1 ) as new_user group by new_user.first_date
- 基于step3结果补全未出现新用户的日期的次日留存(补0就好),并别忘了按日期排序
select distinct unique_date.date, ifnull(new_user_p.p, 0) from login as unique_date left join ( select new_user.first_date as `date`, round(count(new_user.date)/count(new_user.user_id), 3) as p from ( select b.user_id, b.first_date, a.date from login as a right join ( select user_id, min(date) as first_date from login group by user_id ) as b on a.user_id=b.user_id and datediff(a.date, b.first_date)=1 ) as new_user group by new_user.first_date ) as new_user_p on unique_date.date=new_user_p.date order by unique_date.date
至此结束~