SQL33 对试卷得分做min-max归一化

题目主要信息:

  • min-max标准化: ximin(xi)max(xi)min(xi)\frac{x_i - min(x_i)}{max(x_i)-min(x_i)}
  • 将用户作答高难度试卷的得分在每份试卷作答记录内执行min-max归一化后缩放到[0,100][0,100]区间,并输出用户ID、试卷ID、归一化后分数平均值
  • 最后按照试卷ID升序、归一化分数降序输出

问题拆分:

  • 筛选出难度为困难的每个试卷id的最大分数和最小分数:
    • 要筛选的是每个试卷ID下的最大与最小分数,因此要以exam_id分组。知识点:group by
    • 难度信息和得分信息再分别在两个表中,因此通过exam_id将二者连接起来。知识点:join...on...
    • 筛选难度为困难级别的。知识点:where
    • 对每个组求最大最小得分。 max(exam_record.score) as max_score, min(exam_record.score) as min_score 知识点:max()、min()
    • 筛选结果记为id_and_max_min_table
  • 每份试卷的最大最小得分与每个用户的得分分布在两个表中,因此通过exam_id将表id_and_max_min_table与exam_record连接起来。知识点:join...on...
  • 从连接后的表中选出用户ID、试卷ID及每个分数的归一化结果:
    • 求每个用户对于每份试卷的归一化结果,因此要以uid和exam_id分组。 知识点:group by
    • uid和exam_id直接获取。
    • 使用if函数判断该试卷最大最小值是否相等,然后做归一化计算。if(max_score = min_score, score, round(avg((score - min_score) / (max_score - min_score)) * 100)) as avg_new_score 知识点:if、avg()、round()
    • 筛选条件是这个用户的得分不能用空。知识点:where
  • 按照先试卷ID升序后新分数降序的次序输出。order by exam_id asc, avg_new_score desc 知识点:order by

代码:

select uid,
       e_r.exam_id as exam_id,
       if(max_score = min_score, score, round(avg((score - min_score) / (max_score - min_score)) * 100)) as avg_new_score
from exam_record e_r join(
    select exam_record.exam_id,
           max(exam_record.score) as max_score,
           min(exam_record.score) as min_score
    from exam_record join examination_info
    on exam_record.exam_id = examination_info.exam_id
    where examination_info.difficulty = 'hard'
    group by exam_record.exam_id
) id_and_max_min_table
on e_r.exam_id = id_and_max_min_table.exam_id
where score is not null 
group by uid, exam_id
order by exam_id asc, avg_new_score desc