明确题意:

统计2021年每个未完成试卷作答数大于1的有效用户的数据(有效用户指完成试卷作答数至少为1且未完成数小于5),输出用户ID、未完成试卷作答数、完成试卷作答数、作答过的试卷tag集合,按未完成试卷数量由多到少排序


问题分解:

  • 关联作答记录和试卷信息:left join examination_info on using(exam_id);(题中exam_record中的exam_id在examination_info均存在,所以用left join和inner join效果一样)
  • 筛选2021年的记录:where year(start_time)=2021
  • 获取各用户的tag,start_time及未完成标记和已完成标记,如果该作答记录交卷了则已完成标记为1,未完成标记为0,否则相反:if(submit_time is null, 1, null) as incomplete
  • 按用户分组:group by uid
  • 统计未完成试卷作答数和已完成试卷作答数:count(incomplete) as incomplete_cnt
  • 统计作答过的tag集合:
    • 对于每条作答tag,用:连接日期和tag:concat_ws(':', date(start_time), tag)
    • 对于一个人(组内)的多条作答,用;连接去重后的作答记录:group_concat(distinct concat_ws(':', date(start_time), tag) SEPARATOR ';')
  • 筛选未完成试卷作答数大于1的有效用户:having complete_cnt >= 1 and incomplete_cnt BETWEEN 2 and 4
    • 完成试卷作答数至少为1:complete_cnt >= 1
    • 未完成数小于5:incomplete_cnt < 5
    • 未完成试卷作答数大于1:incomplete_cnt > 1

细节问题:

  • 表头重命名:as
  • 按未完成试卷数量由多到少排序:order by incomplete_cnt DESC

完整代码:

SELECT uid, count(incomplete) as incomplete_cnt,
    count(complete) as complete_cnt,
    group_concat(distinct concat_ws(':', date(start_time), tag) SEPARATOR ';') as detail
from (
    SELECT uid, tag, start_time,
        if(submit_time is null, 1, null) as incomplete,
        if(submit_time is null, null, 1) as complete
    from exam_record 
    left join examination_info using(exam_id)
    where year(start_time)=2021
) as exam_complete_rec
group by uid
having complete_cnt >= 1 and incomplete_cnt BETWEEN 2 and 4
order by incomplete_cnt DESC