SELECT 
ROUND(COUNT(DISTINCT user_id)/(SELECT COUNT(DISTINCT user_id) FROM login),3)
FROM login
WHERE (user_id,date) IN (
    SELECT user_id,DATE_ADD(MIN(date),INTERVAL 1 DAY) FROM login
    GROUP BY user_id);

一开始的思路: 公式:连续两天登陆的用户/总用户=留存率 首先子查询找出连续两天登陆的用户数,利用两表联查,条件是两表user_id相等,登陆天数连续。

(SELECT COUNT(DISTINCT l1.user_id) success 
    FROM login l1, login l2 
    WHERE l1.user_id=l2.user_id AND TIMESTAMPDIFF(DAY,l1.date,l2.date)=1) a;

然后按照公式算出留存率:

SELECT round(a.success/COUNT(DISTINCT l.user_id),3) p
FROM login l,
(SELECT COUNT(DISTINCT l1.user_id) success 
    FROM login l1, login l2 
    WHERE l1.user_id=l2.user_id AND TIMESTAMPDIFF(DAY,l1.date,l2.date)=1) a;

错误分析:虽然答案正确,但这不是新用户的登陆留存率,公式应该改正为:新用户第二天仍然登录的人数/总用户数,也就是说应该是计算出每组user_id中第一天登录了第二天仍然登录的人数。这里要先找出第一天登录的日期,然后用DATE_ADD(MIN(date,INTERVAL 1 DAY)找到第二天日期,可以不用两表联查而是直接查询计数第二天在表里的人数,利用where子句条件筛选出符合条件的user_id和date:

WHERE (user_id,date) IN (
    SELECT user_id,DATE_ADD(MIN(date),INTERVAL 1 DAY) FROM login
    GROUP BY user_id);

改正后代码为:

SELECT 
ROUND(COUNT(DISTINCT user_id)/(SELECT COUNT(DISTINCT user_id) FROM login),3)
FROM login
WHERE (user_id,date) IN (
    SELECT user_id,DATE_ADD(MIN(date),INTERVAL 1 DAY) FROM login
    GROUP BY user_id);