1. 确定总体问题
我们需要找出在注册后连续登录不少于3天的用户,并输出他们的用户ID,并按用户ID升序排序。
2. 分析关键问题
- 提取登录日期:从登录信息中提取每个用户的登录日期。
- 编号登录记录:为每个用户的登录日期编号,以便识别连续登录的天数。
- 识别连续登录:通过计算日期差异识别连续登录的天数。
- 过滤和排序:筛选出连续登录不少于3天的用户,并按用户ID升序排序。
3. 解决每个关键问题的代码及讲解
步骤1:提取登录日期
我们使用WITH子句创建一个临时表login_dates,提取每个用户的登录日期:
WITH login_dates AS (
SELECT DISTINCT user_id, DATE(log_time) AS login_date
FROM login_tb
WHERE user_id IN (SELECT user_id FROM register_tb)
)
SELECT DISTINCT user_id, DATE(log_time) AS login_date:提取每个用户的唯一登录日期,DISTINCT去重,DATE用于提取日期。WHERE user_id IN (SELECT user_id FROM register_tb):只选择注册用户的登录记录。
步骤2:编号登录记录
我们为每个用户的登录日期编号,以便识别连续登录的天数:
numbered_logins AS (
SELECT user_id, login_date,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_date) AS rn
FROM login_dates
)
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_date) AS rn:为每个用户的登录日期按时间顺序编号。PARTITION BY user_id按照user_id分区ORDER BY login_date按照login_date排序
步骤3:识别连续登录
通过计算日期差异识别连续登录的天数:
grouped_logins AS (
SELECT user_id, COUNT(*) AS consecutive_days
FROM numbered_logins
GROUP BY user_id, DATE_SUB(login_date, INTERVAL rn DAY)
)
DATE_SUB(login_date, INTERVAL rn DAY):通过减去编号,连续的日期将具有相同的结果,从而可以分组。COUNT(*) AS consecutive_days:计算每个分组的连续天数。
DATE_SUB处理前后样例如下
| 2023-01-02 | 1 | → | 2023-01-01 |
| 2023-01-03 | 2 | → | 2023-01-01 |
| 2023-01-05 | 3 | → | 2023-01-02 |
步骤4:过滤和排序
筛选出连续登录不少于3天的用户,并按用户ID升序排序:
SELECT user_id
FROM grouped_logins
WHERE consecutive_days >= 3
ORDER BY user_id;
WHERE consecutive_days >= 3:筛选出连续登录不少于3天的用户。ORDER BY user_id:按用户ID升序排序。
完整代码
WITH login_dates AS (
SELECT DISTINCT user_id, DATE(log_time) AS login_date
FROM login_tb
WHERE user_id IN (SELECT user_id FROM register_tb)
),
numbered_logins AS (
SELECT user_id, login_date,
ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_date) AS rn
FROM login_dates
),
grouped_logins AS (
SELECT user_id, COUNT(*) AS consecutive_days
FROM numbered_logins
GROUP BY user_id, DATE_SUB(login_date, INTERVAL rn DAY)
)
SELECT user_id
FROM grouped_logins
WHERE consecutive_days >= 3
ORDER BY user_id;

京公网安备 11010502036488号