看了一眼,排在前面的答案都没有考虑这样的情况:客户连续入住,但可能连续入住的是不同房间。比如某客户2022年6月12日入住1001房间,觉得房间不合适,跟前台说换房,比如换到1002房间,住到6月13日退房,接着6月13日又入住1003房间,6月14日退房。 这里含有2个信息:

  • 连续入住,但住的是不同房间,在表里为不同的记录,但用户可以算作连续入住多晚;
  • 如果连续入住的是不同房间,那结果表里房间id应该怎么展示呢?是展示住过的所有房间id吗?还是展示最初入住的房间id?还是…… 题目中并没有明确,或者说给出的示例数据并没有考虑这种情况。 那么,如果考虑用户连续入住不同房间的情况下,假设结果展示最初入住的房间id。
  1. 需求:
  • 计算每位客户在每个房间的连续入住天数
  • 筛选出连续入住天数>1的客户
  • 按照要求排序输出
  1. 识别连续入住逻辑:
  • 使用窗口函数检查客户每次退房和下个入住之间是否有间隔,即使用 DATEDIFF(checkin_date, LAG(checkout_date)) > 0 判断是否有间隔
  • 如果没有间隔或间隔小于1天,则视为连续入住:
    • 同一天换房(差0天→连续)
    • 间隔天换房(差>=1天→新组)
  1. 计算连续入住天数:
  • 对客户的入住记录进行分组,连续入住的记录归为同一组
  • 计算每组的入住天数作为连续入住天数

代码如下:

WITH ordered_stays AS (
SELECT 
    user_id,
    room_id,
    checkin_time,
    checkout_time,
    DATE(checkin_time) AS checkin_date,
    DATE(checkout_time) AS checkout_date,
    ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY checkin_time) AS stay_seq
FROM 
    checkin_tb
WHERE 
    DATE(checkin_time) >= '2022-06-12'
),
flagged_consec_stays AS (
SELECT 
    user_id,
    room_id,
    checkin_date,
    checkout_date,
    stay_seq,
    CASE WHEN DATEDIFF(checkin_date, LAG(checkout_date) OVER (PARTITION BY user_id ORDER BY stay_seq)) > 0 
        THEN 1 -- 新的连续入住组
        ELSE 0 -- 与前一次入住连续
    END AS new_stay_group
FROM 
    ordered_stays
),
stay_groups as(
SELECT 
    user_id,
    room_id,
    checkin_date,
    checkout_date,
    stay_seq,
    SUM(new_stay_group) OVER (PARTITION BY user_id ORDER BY stay_seq) AS stay_group_id
FROM flagged_consec_stays 
),
group_first_rooms AS (
SELECT 
    user_id,
    -- 取组内checkin_date最早的room_id(最初入住的房间)
    FIRST_VALUE(room_id) OVER (
        PARTITION BY user_id, stay_group_id 
        ORDER BY checkin_date
    ) AS first_room_id,
    MIN(checkin_date) OVER (PARTITION BY user_id, stay_group_id) AS start_date,
    MAX(checkout_date) OVER (PARTITION BY user_id, stay_group_id) AS end_date
FROM 
    stay_groups
),
continuous_stays AS (
SELECT DISTINCT
    user_id,
    first_room_id,
    start_date,
    end_date,
    DATEDIFF(end_date, start_date) AS consec_nights
FROM 
    group_first_rooms
WHERE 
    DATEDIFF(end_date, start_date) >= 1
)
SELECT 
    cs.user_id,
    cs.first_room_id AS room_id,
    gr.room_type,
    cs.consec_nights AS days
FROM 
    continuous_stays cs
JOIN 
    guestroom_tb gr ON cs.first_room_id = gr.room_id
WHERE 
    cs.consec_nights > 1
ORDER BY 
    cs.consec_nights ASC,
    cs.first_room_id ASC,
    cs.user_id DESC
;