SQL32 试卷完成数同比2020年的增长率及排名变化
题目主要信息:
- 计算2021年上半年各类试卷的做完次数相比2020年的增长率(百分比格式,保留1位小数),以及做完次数排名变化,按增长率和21年排名降序输出
- 试卷信息表examination_info(exam_id试卷ID, tag试卷类别, difficulty试卷难度, duration考试时长, release_time发布时间)
- 试卷作答记录表exam_record(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分)
问题拆分:
- 先找到在2020年上半年或者2021年上半年被完成试卷tag及完成年份:
- 试卷标签和完成信息记录在两个表中,因此我们要以exam_id为基准将examination_info表和exam_record表连接起来。知识点:join...on...
- 筛选条件为提交时间不为null且提交时间的年份在2020到2021之间且提交时间的月份不超过6。
where submit_time is not null and year(submit_time) between 2020 and 2021 and month(submit_time) <= 6
知识点:where、year()、month()、between
- 选出tag标签及相应标签的完成年份,记为qualify_year_table
- 从qualify_year_table中选出试卷标签,年份,以及每个标签每个年份的完成试卷数:
- 每个试卷标签每个年份单独统计,因此要以标签和年份作为分组。知识点:group by
- 标签和年份直接获取,试卷完成数量统计每组的数量即可。知识点:count()
- 筛选出的信息记为year_cnt_table
- 从year_cnt_table表中再筛选出标签、年份、每个标签下每年的数量和每年完成数量的排名:
- 前三个可以由year_cnt_table表直接得到
- 排名我们以年份作为分组,因为是计算每年的排名,按完成试卷数降序排名。
rank() over(partition by start_year order by exam_cnt desc) as exam_cnt_rank
知识点:rank() over()、partition by、order by
- 筛选出的结果记为year_rank_table
- 从year_rank_table表中筛选出标签及每个标签对应的2020年答题数、排名,2021年答题数、排名:
- 因为是以每个标签计算,因此要以标签分组。知识点:group by
- 标签可以由year_rank_table表直接得到。
- 每个标签2020年完成试卷数,我们只记录年份为2020的试卷数。
sum(case when start_year = 2020 then exam_cnt end) as exam_cnt_20
2020年的完成排名我们也只记录年份为2020的排名。sum(case when start_year = 2020 then exam_cnt_rank end) as exam_cnt_rank_20
知识点:sum()、case when...then...end
- 2021年的情况同上,筛选出的结果记为num_and_rank_of_20_21
- 从num_and_rank_of_20_21表中筛选出两个年份都有答题时,答题数量、增长率及排名变化:
- 标签、两年的完成数量,排名信息都可以直接由表num_and_rank_of_20_21得到
- 筛选条件是两年的数量都不为null。
where exam_cnt_21 is not null and exam_cnt_20 is not null
- 增长率用2021年的数量减去2020年的数量除以2020年的数量,然后乘上100保留一位小数,末尾加上百分号。
concat(round((exam_cnt_21 - exam_cnt_20) / exam_cnt_20 * 100, 1), '%') as growth_rate
知识点:concat()、round()
- 排名变化将2021年的排名转化为signed数字减去2020年的排名转化成的signed数字。
(cast(exam_cnt_rank_21 as signed) - cast(exam_cnt_rank_20 as signed)) as rank_delta
知识点:cast
- 输出按照先增长率再21年排名的降序输出。
order by growth_rate desc, exam_cnt_rank_21 desc
知识点:order by
代码:
select tag, exam_cnt_20, exam_cnt_21,
concat(round((exam_cnt_21 - exam_cnt_20) / exam_cnt_20 * 100, 1), '%') as growth_rate,
exam_cnt_rank_20, exam_cnt_rank_21,
(cast(exam_cnt_rank_21 as signed) - cast(exam_cnt_rank_20 as signed)) as rank_delta
from(
select tag,
sum(case when start_year = 2020 then exam_cnt end) as exam_cnt_20,
sum(case when start_year = 2021 then exam_cnt end) as exam_cnt_21,
sum(case when start_year = 2020 then exam_cnt_rank end) as exam_cnt_rank_20,
sum(case when start_year = 2021 then exam_cnt_rank end) as exam_cnt_rank_21
from(
select tag, start_year, exam_cnt,
rank() over(partition by start_year order by exam_cnt desc) as exam_cnt_rank
from(
select tag, start_year,
count(*) as exam_cnt
from(
select tag,
year(submit_time) as start_year
from exam_record e_r join examination_info e_i
on e_r.exam_id = e_i.exam_id
where submit_time is not null
and year(submit_time) between 2020 and 2021
and month(submit_time) <= 6
) qualify_year_table
group by tag, start_year
) year_cnt_table
) year_rank_table
group by tag
) num_and_rank_of_20_21
where exam_cnt_21 is not null
and exam_cnt_20 is not null
order by growth_rate desc, exam_cnt_rank_21 desc