【场景】:每天一周内的数据变化情况
【分类】:窗口平移、rows n preceding
分析思路
难点:
1.国庆头3天,每天一周内的数据变化情况
(1)统计2021年9月25到10月3日每天的点赞量和转发量
-
[条件]:start_time < '2021-10-04' and start_time >= '2021-09-25'
-
[使用]:group by tag,dt;sum
(2)聚集窗口函数的应用,滑动7天
需要用到高级窗口函数
- [使用]:sum(like_cnt) over (partition by tag order by dt rows 6 preceding);max(retweet_cnt) over (partition by tag order by dt rows 6 preceding)
(3)统计2021年国庆头3天每类视频每天的近一周总点赞量和一周内最大单天转发量
需要用到聚合窗口函数
- [使用]:sum(follow_add-follow_sub) over(partition by author order by month)
求解代码
方法一:
with 子句
with
main as(
#统计2021年9月25到10月3日每天的点赞量和转发量
select
tag,
date(start_time) as dt,
sum(if_like) as like_cnt,
sum(if_retweet) as retweet_cnt
from tb_user_video_log a,tb_video_info b
where a.video_id = b.video_id
and start_time < '2021-10-04'
and start_time >= '2021-09-25'
group by tag,dt
order by dt
)
,main1 as(
#聚集窗口函数的应用,滑动7天
select
tag,
dt,
sum(like_cnt) over (partition by tag order by dt rows 6 preceding) as sum_like_cnt_7d,
max(retweet_cnt) over (partition by tag order by dt rows 6 preceding) as max_retweet_cnt_7d
from main
)
#统计2021年国庆头3天每类视频每天的近一周总点赞量和一周内最大单天转发量
select
tag,
dt,
sum_like_cnt_7d,
max_retweet_cnt_7d
from main1
where dt between '2021-10-01' and '2021-10-03'
order by tag desc,dt
方法二:
from嵌套子查询
#统计2021年国庆头3天每类视频每天的近一周总点赞量和一周内最大单天转发量
select
tag,
dt,
sum_like_cnt_7d,
max_retweet_cnt_7d
from(
#聚集窗口函数的应用,滑动7天
select
tag,
dt,
sum(like_cnt) over (partition by tag order by dt rows 6 preceding) as sum_like_cnt_7d,
max(retweet_cnt) over (partition by tag order by dt rows 6 preceding) as max_retweet_cnt_7d
from(
#统计2021年9月25到10月3日每天的点赞量和转发量
select
tag,
date(start_time) as dt,
sum(if_like) as like_cnt,
sum(if_retweet) as retweet_cnt
from tb_user_video_log a,tb_video_info b
where a.video_id = b.video_id
and start_time < '2021-10-04'
and start_time >= '2021-09-25'
group by tag,dt
order by dt
) main
) main1
where dt between '2021-10-01' and '2021-10-03'
order by tag desc,dt