问题拆分

1.找到各类试卷tag在2020年与2021年上半年的完成次数

    1. 试卷完成信息出现在exam_record表中,试卷类别出现在examination_info中,因此通过right join或者left join两表形成表t1
....
		exam_record t1
    right join
        examination_info t2
    on t1.exam_id = t2.exam_id
....
    1. 由于需要2020年与2021年上半年的数据,在连接表时,使用where筛选submit_time
....
    where
        year(submit_time) in (2021, 2020)
    and
        month(submit_time)<=6
 ....
  • 3.筛选数据后,使用group by 依次对类别tag和答题时间submit_time进行分组,使用count(*)统计每类试卷每年的答题数量;
....
count(*) exam_cnt,
....
group by
        tag,year(submit_time)
....
  • 4.使用over()rank()对每年的各类试卷进行排名,按照试卷完成数倒序排名;
....
rank() over(partition by year(submit_time) order by count(*) desc) exam_cnt_rank
....

形成t1

	select
        tag,
        year(submit_time) start_year,
        count(*) exam_cnt,
        rank() over(partition by year(submit_time) order by count(*) desc) exam_cnt_rank
    from
        exam_record t1
    right join
        examination_info t2
    on t1.exam_id = t2.exam_id
    where
        year(submit_time) in (2021, 2020)
        and
        month(submit_time)<=6
    group by
        tag,year(submit_time)

2.统计同比增长率以及排名变化(在t1表中计算即可)

  • 主要思路:t1中的每类试卷有2020年以及2021年的完成数量,有两行数据,对tag使用group by分组变成一行数据,按照年份start_year使用if()和sum()嵌套进行计算。
  • 注意!!!为保证同类试卷在2020年与2021年都有数据,需要对group by 后的数据使用having进行筛选,满足count(*)=2

eg:计算2020年的试卷完成数量

sum(if(start_year=2020, exam_cnt, 0)) exam_cnt_20

eg:计算同比增长率(2021年完成数量减去2020年完成数量,再除以2020年完成数量)

sum(if(start_year=2020, -exam_cnt, exam_cnt))/sum(if(start_year=2020, exam_cnt, 0))

  • 该部分统计代码
	sum(if(start_year=2020, exam_cnt, 0)) exam_cnt_20,
    sum(if(start_year=2021, exam_cnt, 0)) exam_cnt_21,
    concat(round(100*sum(if(start_year=2020, -exam_cnt, exam_cnt))/sum(if(start_year=2020, exam_cnt, 0)),1), '%') growth_rate,
    sum(if(start_year=2020, exam_cnt_rank, 0)) exam_cnt_rank_20,
    sum(if(start_year=2021, exam_cnt_rank, 0)) exam_cnt_rank_21,
    sum(if(start_year=2020, -exam_cnt_rank, exam_cnt_rank)) rank_delta

总体代码

select
    tag,
    sum(if(start_year=2020, exam_cnt, 0)) exam_cnt_20,
    sum(if(start_year=2021, exam_cnt, 0)) exam_cnt_21,
    concat(round(100*sum(if(start_year=2020, -exam_cnt, exam_cnt))/sum(if(start_year=2020, exam_cnt, 0)),1), '%') growth_rate,
    sum(if(start_year=2020, exam_cnt_rank, 0)) exam_cnt_rank_20,
    sum(if(start_year=2021, exam_cnt_rank, 0)) exam_cnt_rank_21,
    sum(if(start_year=2020, -exam_cnt_rank, exam_cnt_rank)) rank_delta
from
    (select
        tag,
        year(submit_time) start_year,
        count(*) exam_cnt,
        rank() over(partition by year(submit_time) order by count(*) desc) exam_cnt_rank
    from
        exam_record t1
    right join
        examination_info t2
    on t1.exam_id = t2.exam_id
    where
        year(submit_time) in (2021, 2020)
        and
        month(submit_time)<=6
    group by
        tag,year(submit_time)) t1
group by
    tag
having
    count(*) = 2
order by
    growth_rate desc, exam_cnt_rank_21 desc;