时间:37ms 内存:6384KB
本题需求不难理解,难点在于如何计算瞬时的最大计数(在看人数)
首先,我们自然会想到常见的编码+联立。在此对原表in_time和out_time进行编码,in为观看人数+1, out为观看人数-1,进行两次SELECT联立,并按artical_id升序,时间戳升序:
代码:
SELECT
artical_id, in_time dt, 1 diff
FROM tb_user_log
WHERE artical_id != 0
UNION ALL
SELECT
artical_id, out_time dt, -1 diff
FROM tb_user_log
WHERE artical_id != 0
ORDER BY 1,2
结果:
9001|2021-11-01 10:00:00|1
9001|2021-11-01 10:00:01|1
9001|2021-11-01 10:00:09|1
9001|2021-11-01 10:00:11|-1
9001|2021-11-01 10:00:28|1
9001|2021-11-01 10:00:38|-1
9001|2021-11-01 10:00:51|1
9001|2021-11-01 10:00:58|-1
9001|2021-11-01 10:00:59|-1
9001|2021-11-01 10:01:50|-1
9002|2021-11-01 11:00:45|1
9002|2021-11-01 11:00:55|1
9002|2021-11-01 11:01:11|-1
9002|2021-11-01 11:01:24|-1
意义:
某篇文章artical_id,在给定的时间戳dt的,瞬时观看人数变化diff
到这一步,本题的解法基本就明朗了:
我们考虑使用SUM窗口函数,按文章id维度,统计按时间戳升序的观看人数变化情况:
代码:
SELECT
artical_id,
dt,
SUM(diff) OVER(PARTITION BY artical_id ORDER BY dt) instant_viewer_cnt
FROM (
SELECT
artical_id, in_time dt, 1 diff
FROM tb_user_log
WHERE artical_id != 0
UNION ALL
SELECT
artical_id, out_time dt, -1 diff
FROM tb_user_log
WHERE artical_id != 0) t1
结果:
9001|2021-11-01 10:00:00|1
9001|2021-11-01 10:00:01|2
9001|2021-11-01 10:00:09|3
9001|2021-11-01 10:00:11|2
9001|2021-11-01 10:00:28|3
9001|2021-11-01 10:00:38|2
9001|2021-11-01 10:00:51|3
9001|2021-11-01 10:00:58|2
9001|2021-11-01 10:00:59|1
9001|2021-11-01 10:01:50|0
9002|2021-11-01 11:00:45|1
9002|2021-11-01 11:00:55|2
9002|2021-11-01 11:01:11|1
9002|2021-11-01 11:01:24|0
意义:
某篇文章artical_id,在给定的时间戳dt的,瞬时累计观看人数instant_viewer_cnt
然后到了本题的坑点。
题目要求在瞬时统计时遵循【先进后出】:如果同一时刻有进入也有离开时,先记录用户数增加,再记录减少。
因此在ORDER BY层面,在遵循dt升序的同时,还要遵循先+1,再-1的原则,即diff DESC:
SUM(diff) OVER(PARTITION BY artical_id ORDER BY dt, diff DESC)
——————————————————————————————————
步骤拆解:
- 对原表编码并联立;
- 按artical_id维度,dt升序 ,diff降序,对diff进行SUM开窗统计,得到每个artical_id的瞬时观看人数instant_viewer_cnt;
- 最外层SELECT按artical_id聚合,通过MAX(instant_viewer_cnt)取出瞬时观看最大值max_uv,并排序。
——————————————————————————————————
完整代码如下:
SELECT
artical_id,
MAX(instant_viewer_cnt) max_uv
FROM (
SELECT
artical_id,
SUM(diff) OVER(PARTITION BY artical_id ORDER BY dt, diff DESC) instant_viewer_cnt
FROM (
SELECT
artical_id, in_time dt, 1 diff
FROM tb_user_log
WHERE artical_id != 0
UNION ALL
SELECT
artical_id, out_time dt, -1 diff
FROM tb_user_log
WHERE artical_id != 0) t1
) t2
GROUP BY 1
ORDER BY 2 DESC
本解法思路来源于@大大BIG_GUM的代码,在此表示感谢!