此题共包含两张表:
表1:用户-视频互动表tb_user_video_log
表2:短视频信息表tb_video_info
要解决的问题:
统计2021年国庆头3天每类视频每天的近一周总点赞量和一周内最大单天转发量,结果按视频类别降序、日期升序排序。假设数据库中数据足够多,至少每个类别下国庆头3天及之前一周的每天都有播放记录。
解题思路:
- 2021年国庆前三天,播放视频的类别 select...from...where dt in ('2021-10-01','2021-10-02','2021-10-03')
- 每类视频近每天点赞量和转发量 sum(if_like) sum_like_cnt, sum(if_retweet) sum_if_retweet_cnt
- 以2021.10.1、2021.10.2、2021.10.3延后7天,分别统计总的点赞量和最大转发量
sum() over(PARTITION by tag ORDER BY dt rows BETWEEN 6 preceding and current row) - 按视频类别降序、日期升序排序 order BY tag desc, dt
知识点
- 窗口函数,往常使用窗口函数只用到了前两个参数分组和排序,以至于刚看到题的时候忽略了窗口函数可以取计算范围,浪费了不少代码。计算范围通常写做rows between... preceding and...row
- 取当前行和前面两行rows between 2 preceding and current row
- 取当前行和后面两行rows between current row and 1 following
- 取当前行及之后所有行rows between current row and unbounded following
- 取当前行及之前所有行 rows between unbounded preceding and current row
- 取当前行的前三行及后两行rows between 3 preceding and 2 following
select *
from(
select tag, dt,
sum(sum_like_cnt) over(PARTITION by tag ORDER BY dt rows BETWEEN 6 preceding and current row) sum_like_cnt_7d,
max(sum_if_retweet_cnt) over(PARTITION by tag ORDER BY dt rows BETWEEN 6 preceding and current row) sum_retweet_cnt_7d
from(
select tag, date(start_time) dt, sum(if_like) sum_like_cnt, sum(if_retweet) sum_if_retweet_cnt
from tb_user_video_log a
left join tb_video_info b
on a.video_id = b.video_id
group by tag, dt
) t1
) t2
where dt in ('2021-10-01','2021-10-02','2021-10-03')
order BY tag desc, dt