SELECT user_id, COUNT(date_rk_diff) days_count FROM
 (SELECT user_id, DATE_SUB(sales_date, INTERVAL rk DAY) date_rk_diff FROM
  (SELECT user_id, sales_date, ROW_NUMBER() over (partition by user_id order by sales_date) rk
   FROM (SELECT DISTINCT user_id, sales_date FROM sales_tb) t1
   ) t2
  ) t3
GROUP BY user_id
HAVING days_count>=2
ORDER BY user_id

参考了讨论区牛客532504983号的思路。

第一,sales_date日期去重,去除一天多次购买的情况。

第二,row_number()按用户分组对日期排序,记为rk,意为每个用户登录的第rk天。

第三,如果日期记录连续的话,每个用户的sales_date减去rk会得到一样的日期。这是关键。

比如 2021-1-1减去1得2020-12-31,

2021-1-2减去2得2020-12-31,

2021-1-3减去3得2020-12-31,

2021-1-5减去4得2021-01-01,

2021-1-6减去5得2021-01-01。

根据减出来的日期,2020-12-31和2021-01-01,进行分组计数(当然GROUP BY user_id也是不能漏的),就可以知道连续登录的天数。

上面代码可以写得更简略一点,直接用user_id, DATE_SUB(sales_date, INTERVAL rk DAY)来分组

SELECT user_id, COUNT(*) days_count FROM 
(SELECT user_id, sales_date, ROW_NUMBER() over (partition by user_id order by sales_date) rk
 FROM (SELECT DISTINCT user_id, sales_date FROM sales_tb) t1
 ) t2
GROUP BY user_id, DATE_SUB(sales_date, INTERVAL rk DAY)
HAVING days_count>=2
ORDER BY user_id