SQL33 对试卷得分做min-max归一化
题目主要信息:
- min-max标准化: max(xi)−min(xi)xi−min(xi)
- 将用户作答高难度试卷的得分在每份试卷作答记录内执行min-max归一化后缩放到[0,100]区间,并输出用户ID、试卷ID、归一化后分数平均值
- 最后按照试卷ID升序、归一化分数降序输出
问题拆分:
- 筛选出难度为困难的每个试卷id的最大分数和最小分数:
- 要筛选的是每个试卷ID下的最大与最小分数,因此要以exam_id分组。知识点:group by
- 难度信息和得分信息再分别在两个表中,因此通过exam_id将二者连接起来。知识点:join...on...
- 筛选难度为困难级别的。知识点:where
- 对每个组求最大最小得分。
max(exam_record.score) as max_score, min(exam_record.score) as min_score
知识点:max()、min()
- 筛选结果记为id_and_max_min_table
- 每份试卷的最大最小得分与每个用户的得分分布在两个表中,因此通过exam_id将表id_and_max_min_table与exam_record连接起来。知识点:join...on...
- 从连接后的表中选出用户ID、试卷ID及每个分数的归一化结果:
- 求每个用户对于每份试卷的归一化结果,因此要以uid和exam_id分组。 知识点:group by
- uid和exam_id直接获取。
- 使用if函数判断该试卷最大最小值是否相等,然后做归一化计算。
if(max_score = min_score, score, round(avg((score - min_score) / (max_score - min_score)) * 100)) as avg_new_score
知识点:if、avg()、round()
- 筛选条件是这个用户的得分不能用空。知识点:where
- 按照先试卷ID升序后新分数降序的次序输出。
order by exam_id asc, avg_new_score desc
知识点:order by
代码:
select uid,
e_r.exam_id as exam_id,
if(max_score = min_score, score, round(avg((score - min_score) / (max_score - min_score)) * 100)) as avg_new_score
from exam_record e_r join(
select exam_record.exam_id,
max(exam_record.score) as max_score,
min(exam_record.score) as min_score
from exam_record join examination_info
on exam_record.exam_id = examination_info.exam_id
where examination_info.difficulty = 'hard'
group by exam_record.exam_id
) id_and_max_min_table
on e_r.exam_id = id_and_max_min_table.exam_id
where score is not null
group by uid, exam_id
order by exam_id asc, avg_new_score desc