此题共包含两张表:

表1:用户-视频互动表tb_user_video_log

表2:短视频信息表tb_video_info

要解决的问题:

统计2021年国庆头3天每类视频每天的近一周总点赞量和一周内最大单天转发量,结果按视频类别降序、日期升序排序。假设数据库中数据足够多,至少每个类别下国庆头3天及之前一周的每天都有播放记录。

解题思路:

  1. 2021年国庆前三天,播放视频的类别 select...from...where dt in ('2021-10-01','2021-10-02','2021-10-03')
  2. 每类视频近每天点赞量和转发量 sum(if_like) sum_like_cnt, sum(if_retweet) sum_if_retweet_cnt
  3. 以2021.10.1、2021.10.2、2021.10.3延后7天,分别统计总的点赞量和最大转发量
    sum() over(PARTITION by tag ORDER BY dt rows BETWEEN 6 preceding and current row)
  4. 按视频类别降序、日期升序排序 order BY tag desc, dt

知识点

  1. 窗口函数,往常使用窗口函数只用到了前两个参数分组和排序,以至于刚看到题的时候忽略了窗口函数可以取计算范围,浪费了不少代码。计算范围通常写做rows between... preceding and...row
  • 取当前行和前面两行rows between 2 preceding and current row
  • 取当前行和后面两行rows between current row and 1 following
  • 取当前行及之后所有行rows between current row and unbounded following
  • 取当前行及之前所有行 rows between unbounded preceding and current row
  • 取当前行的前三行及后两行rows between 3 preceding and 2 following
select *
from(
    select tag, dt, 
    sum(sum_like_cnt) over(PARTITION by tag ORDER BY dt rows BETWEEN 6 preceding and current row) sum_like_cnt_7d, 
    max(sum_if_retweet_cnt) over(PARTITION by tag ORDER BY dt rows BETWEEN 6 preceding and current row) sum_retweet_cnt_7d
    from(
        select tag, date(start_time) dt, sum(if_like) sum_like_cnt, sum(if_retweet) sum_if_retweet_cnt
        from tb_user_video_log a
        left join tb_video_info b
        on a.video_id = b.video_id
        group by tag, dt
        ) t1
    ) t2
where dt in ('2021-10-01','2021-10-02','2021-10-03')
order BY tag desc, dt