SQL32 试卷完成数同比2020年的增长率及排名变化

题目主要信息:

  • 计算2021年上半年各类试卷的做完次数相比2020年的增长率(百分比格式,保留1位小数),以及做完次数排名变化,按增长率和21年排名降序输出
  • 试卷信息表examination_info(exam_id试卷ID, tag试卷类别, difficulty试卷难度, duration考试时长, release_time发布时间)
  • 试卷作答记录表exam_record(uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分)

问题拆分:

  • 先找到在2020年上半年或者2021年上半年被完成试卷tag及完成年份:
    • 试卷标签和完成信息记录在两个表中,因此我们要以exam_id为基准将examination_info表和exam_record表连接起来。知识点:join...on...
    • 筛选条件为提交时间不为null且提交时间的年份在2020到2021之间且提交时间的月份不超过6。where submit_time is not null and year(submit_time) between 2020 and 2021 and month(submit_time) <= 6 知识点:where、year()、month()、between
    • 选出tag标签及相应标签的完成年份,记为qualify_year_table
  • 从qualify_year_table中选出试卷标签,年份,以及每个标签每个年份的完成试卷数:
    • 每个试卷标签每个年份单独统计,因此要以标签和年份作为分组。知识点:group by
    • 标签和年份直接获取,试卷完成数量统计每组的数量即可。知识点:count()
    • 筛选出的信息记为year_cnt_table
  • 从year_cnt_table表中再筛选出标签、年份、每个标签下每年的数量和每年完成数量的排名:
    • 前三个可以由year_cnt_table表直接得到
    • 排名我们以年份作为分组,因为是计算每年的排名,按完成试卷数降序排名。rank() over(partition by start_year order by exam_cnt desc) as exam_cnt_rank 知识点:rank() over()、partition by、order by
    • 筛选出的结果记为year_rank_table
  • 从year_rank_table表中筛选出标签及每个标签对应的2020年答题数、排名,2021年答题数、排名:
    • 因为是以每个标签计算,因此要以标签分组。知识点:group by
    • 标签可以由year_rank_table表直接得到。
    • 每个标签2020年完成试卷数,我们只记录年份为2020的试卷数。sum(case when start_year = 2020 then exam_cnt end) as exam_cnt_20 2020年的完成排名我们也只记录年份为2020的排名。sum(case when start_year = 2020 then exam_cnt_rank end) as exam_cnt_rank_20 知识点:sum()、case when...then...end
    • 2021年的情况同上,筛选出的结果记为num_and_rank_of_20_21
  • 从num_and_rank_of_20_21表中筛选出两个年份都有答题时,答题数量、增长率及排名变化:
    • 标签、两年的完成数量,排名信息都可以直接由表num_and_rank_of_20_21得到
    • 筛选条件是两年的数量都不为null。where exam_cnt_21 is not null and exam_cnt_20 is not null
    • 增长率用2021年的数量减去2020年的数量除以2020年的数量,然后乘上100保留一位小数,末尾加上百分号。concat(round((exam_cnt_21 - exam_cnt_20) / exam_cnt_20 * 100, 1), '%') as growth_rate 知识点:concat()、round()
    • 排名变化将2021年的排名转化为signed数字减去2020年的排名转化成的signed数字。(cast(exam_cnt_rank_21 as signed) - cast(exam_cnt_rank_20 as signed)) as rank_delta 知识点:cast
  • 输出按照先增长率再21年排名的降序输出。order by growth_rate desc, exam_cnt_rank_21 desc 知识点:order by

代码:

select tag, exam_cnt_20, exam_cnt_21,
       concat(round((exam_cnt_21 - exam_cnt_20) / exam_cnt_20 * 100, 1), '%') as growth_rate,
       exam_cnt_rank_20, exam_cnt_rank_21,
       (cast(exam_cnt_rank_21 as signed) - cast(exam_cnt_rank_20 as signed)) as rank_delta
from(
    select tag, 
           sum(case when start_year = 2020 then exam_cnt end) as exam_cnt_20,
           sum(case when start_year = 2021 then exam_cnt end) as exam_cnt_21,
           sum(case when start_year = 2020 then exam_cnt_rank end) as exam_cnt_rank_20,
           sum(case when start_year = 2021 then exam_cnt_rank end) as exam_cnt_rank_21
    from(
        select tag, start_year, exam_cnt,
               rank() over(partition by start_year order by exam_cnt desc) as exam_cnt_rank
        from(
            select tag, start_year, 
                   count(*) as exam_cnt
            from(
                select tag, 
                       year(submit_time) as start_year
                from exam_record e_r join examination_info e_i
                on e_r.exam_id = e_i.exam_id
                where submit_time is not null 
                and year(submit_time) between 2020 and 2021
                and month(submit_time) <= 6
            ) qualify_year_table
            group by tag, start_year
        ) year_cnt_table
    ) year_rank_table
group by tag
) num_and_rank_of_20_21
where exam_cnt_21 is not null 
and exam_cnt_20 is not null
order by growth_rate desc, exam_cnt_rank_21 desc