题解 | 每个月Top3的周杰伦歌曲

with t as (select month(fdate) as `month`,
row_number() over (partition by month(fdate) order by count(b.song_id)desc,a.song_id asc) as ranking,
song_name,
count(b.song_id) as play_pv
from  play_log a
left join song_info b on a.song_id = b.song_id
left join user_info c on a.user_id = c.user_id
where singer_name='周杰伦' and (age between 18 and 25) and year(fdate)=2022
group by month(fdate),song_name,a.song_id)

select * from t
where ranking <=3

要理解这段 SQL 中 row_number() 窗口函数和 group by 的作用，我们可以分两部分拆解分析：

一、`row_number() over (...)` 窗口函数部分

row_number() over (partition by month(fdate) order by count(b.song_id)desc,a.song_id asc) as ranking 是窗口函数的核心逻辑，用于实现“每月内歌曲的排名”：

row_number()：为每一行数据生成一个唯一的序号（从 1 开始）。
partition by month(fdate)：按 fdate 的月份“分区”，即每个月的数据独立计算排名（1 月的排名和 2 月的排名互不影响）。
order by count(b.song_id)desc, a.song_id asc：定义排名的“排序规则”：先按 count(b.song_id)（歌曲播放次数）降序排列（播放量越高，排名越靠前）；若播放次数相同，再按 a.song_id升序排列（保证排序的确定性）。

二、`group by month(fdate),song_name,a.song_id` 分组部分

group by month(fdate),song_name,a.song_id 是聚合函数的分组依据，用于确保“每首歌每月的播放量被正确统计”：

group by 的作用：SELECT 中使用了聚合函数 count(b.song_id)（统计播放次数），因此必须通过 group by 指定“分组维度”，否则 SQL 会报错。
分组字段的意义： month(fdate)：按“月份”分组，确保统计的是每月的播放数据；song_name + a.song_id：按“歌曲名称 + 歌曲 ID”分组，确保是同一首歌的播放数据（避免不同歌曲重名导致统计错误）。

整体逻辑串联

这段 SQL 的目标是**“统计 2022 年周杰伦的歌曲中，18-25 岁用户每月播放量前 3 的歌曲”**：

先通过 group by month(fdate), song_name, a.song_id 分组，统计每首歌每月的播放次数（play_pv）；
再通过 row_number() over (partition by month(fdate) order by ...) 窗口函数，在每个月内对歌曲按“播放量降序 + song_id 升序”排名，生成 ranking；
最后筛选出 ranking <= 3 的记录，即“每月播放量前 3 的歌曲”。

这样设计的好处是：既通过 group by 保证了播放量统计的准确性，又通过窗口函数实现了“每月 Top3”的业务需求，逻辑清晰且高效。

题解 | 每个月Top3的周杰伦歌曲

一、row_number() over (...) 窗口函数部分

二、group by month(fdate),song_name,a.song_id 分组部分

整体逻辑串联

一、`row_number() over (...)` 窗口函数部分

二、`group by month(fdate),song_name,a.song_id` 分组部分