解题关键:新用户=登录日期date 等于 按user_id分组中的MIN(date)
方案1:GROUP BY分组+CASE表达式嵌套新用户的过滤+SUM函数求和得新用户数量
第一步:用户最早登录的记录作为后续新用户的过滤条件
SELECT user_id,MIN(date) FROM login GROUP BY user_id;
第二步:将第一步得过滤条件嵌套到CASE表达式中作为新登录用户字段new
SELECT l1.date,SUM(CASE WHEN (l1.user_id,l1.date) IN (SELECT user_id,MIN(date) FROM login GROUP BY user_id) THEN 1 else 0 END) FROM login l1 GROUP BY l1.date ORDER BY l1.date ASC;
方案2:借鉴大佬@SunburstRun的题解
https://blog.nowcoder.net/n/a2f955514f824bb888f9d7726421e809?f=comment
第一步:算出每个日期的登录用户数
SELECT l1.date,COUNT(DISTINCT l1.user_id) FROM login l1 GROUP BY l1.date;
第二步:加入过滤条件提取每日新登录用户(新登录用户 =user_id相同,date等于MIN(date)
SELECT l1.date,COUNT(DISTINCT l1.user_id) AS new FROM login l1 WHERE l1.date = (SELECT MIN(date) FROM login WHERE user_id = l1.user_id GROUP BY user_id) GROUP BY l1.date;
(这种方法没有显示新用户0的情况,因为被过滤了)
第三步:采用主表login联结第三步(临时表t),注意最后的date需要去重,添加前缀DISTINCT,或者GROUP BY语句都可以
SELECT DISTINCT l.date,IFNULL(t.new_num ,0) FROM login l LEFT JOIN (SELECT l1.date,COUNT(DISTINCT l1.user_id) AS new_num FROM login l1 WHERE l1.date = (SELECT MIN(date) FROM login WHERE user_id = l1.user_id GROUP BY user_id) GROUP BY l1.date ) t ON l.date = t.date ORDER BY l.date ASC;
方案3:ROW_NUMBER窗口函数为user_id登录日期排序+SUM函数
第一步:为用户登录次数排序
SELECT date,ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY date ASC) AS date_rank FROM login;
第二步:将第一步作为表,添加SUM函数+CASE表达式作为过滤求和得出每个日期的新用户数目
(date_rank = 1的用户在对应date中是新用户)
SELECT t.date,SUM(CASE WHEN t.date_rank = 1 THEN 1 ELSE 0 END) AS new FROM (SELECT date,ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY date ASC) AS date_rank FROM login) t GROUP BY t.date ORDER BY t.date ASC;
方案1和方案3原理是相同的,都是case+sum作为过滤求和的条件计算新登录用户