一、明确需求
需求:2021年未完成试卷作答数大于1的有效用户数据
- 有效用户:试卷作答数>=1 and 未完成作答数<5
- 作答过的试卷tag集合detail:使用链接group_concat函数
- 输出标准:用户id,未完成试卷作答数,完成试卷作答数,作答过的试卷tag集合
二、思路
- 对exam_recoard选择2021年数据并去重作为原始表t1
- 基于t1以uid为分组条件,计算未完成作答数incomplete_cnt和完成作答数complete_cnt,结果作为表t2
- 对t2筛选符合条件的数据作为t3
- 由表t3和表t1内连接,同时连接表exam_info获取考试tag,计算detail字段作为表t5
- 将表t3和表t5左连接
三、代码实现
(排序:由未完成试卷数量 desc)
with
# 2021年每个用户数据去重
t1 as
(
select distinct *
from exam_record
where year(start_time) = '2021')
# 每个人未完成试卷数
, t2 as
(
select uid
, sum(if(submit_time is null,1,0)) sta_t # 未完成作答数
, count(submit_time) sub_t
from t1
group by uid)
# 未完成试卷数>1 的有效用户名单
, t3 as
(
select distinct uid,sta_t incomplete_cnt,sub_t complete_cnt
from t2
where sub_t >=1 and sta_t > 1 and sta_t < 5)
# 符合条件的用户及作答detail
, t5 as
(
select uid
, group_concat(distinct concat(date_format(start_time,'%Y-%m-%d'),':',tag) separator ';') detail
from
(
select t1.*,t4.tag
from t3 inner join t1
on t3.uid = t1.uid
left join examination_info t4
on t1.exam_id = t4.exam_id)x
group by uid
)
, t6 as
(
select t3.*,t5.detail
from t3 left join t5
on t3.uid = t5.uid
order by t3.incomplete_cnt desc)
select * from t6
四、收获
1、group_concat函数
GROUP_CONCAT([DISTINCT] expr [,expr ...]
[ORDER BY {unsigned_integer | col_name | expr}
[ASC | DESC] [,col_name ...]]
[SEPARATOR str_val])