一、明确需求

需求:2021年未完成试卷作答数大于1的有效用户数据
  • 有效用户:试卷作答数>=1 and 未完成作答数<5
  • 作答过的试卷tag集合detail:使用链接group_concat函数
  • 输出标准:用户id,未完成试卷作答数,完成试卷作答数,作答过的试卷tag集合

二、思路

  1. 对exam_recoard选择2021年数据并去重作为原始表t1
  2. 基于t1以uid为分组条件,计算未完成作答数incomplete_cnt和完成作答数complete_cnt,结果作为表t2
  3. 对t2筛选符合条件的数据作为t3
  4. 由表t3和表t1内连接,同时连接表exam_info获取考试tag,计算detail字段作为表t5
  5. 将表t3和表t5左连接

三、代码实现

(排序:由未完成试卷数量 desc)

with
# 2021年每个用户数据去重
t1 as
(
select distinct * 
from exam_record
where year(start_time) = '2021')

# 每个人未完成试卷数
, t2 as
(
select uid
    , sum(if(submit_time is null,1,0)) sta_t   # 未完成作答数
    , count(submit_time) sub_t
from t1
group  by uid)

# 未完成试卷数>1 的有效用户名单
, t3 as
(
select distinct uid,sta_t incomplete_cnt,sub_t complete_cnt
from t2 
where sub_t >=1 and sta_t > 1 and sta_t < 5)

# 符合条件的用户及作答detail
, t5 as
(
select uid
    , group_concat(distinct concat(date_format(start_time,'%Y-%m-%d'),':',tag) separator ';') detail
from 
    (
    select t1.*,t4.tag
    from t3 inner join t1 
        on t3.uid = t1.uid 
    left join examination_info t4 
        on t1.exam_id = t4.exam_id)x 
group by uid 
)
, t6 as
(
select t3.*,t5.detail
from t3 left join t5
    on t3.uid = t5.uid 
order by t3.incomplete_cnt desc)

select * from t6

四、收获

1、group_concat函数

GROUP_CONCAT([DISTINCT] expr [,expr ...]
             [ORDER BY {unsigned_integer | col_name | expr}
                 [ASC | DESC] [,col_name ...]]
             [SEPARATOR str_val])