先上全代码,总的来说就是select套了三层

select
    uid,
    d_wy days_window,
   round(count(*)/a_e_cy * d_wy,2) avg_exam_cnt
from (
         select
             uid,
             start_time,
            leade,
             timestampdiff(day,date(start_time),date(leade))+1 d_wyy,
             timestampdiff(day,date(first),date(last))+1 a_e_cyy,
             first_value(timestampdiff(day,date(start_time),date(leade))+1) over(
                 partition by uid
                 order by (timestampdiff(day,date(start_time),date(leade))+1) desc
                 )  d_wy,
             last_value(  timestampdiff(day,date(first),date(last))+1) over(
                 partition by uid
                 order by ( timestampdiff(day,date(first),date(last))+1) desc
                 )  a_e_cy
         FROM (
                  SELECT
                      uid,start_time,
                      first_value(start_time) over(
                          partition by uid
                          order by start_time
                          ) first,
                      lead(start_time,1) over (
                          partition by uid
                          order by start_time
                          ) leade,
                      last_value(start_time) over(
                          partition by uid
                          order by start_time
                          ) last
                  from exam_record
                  where year(start_time) = 2021
              ) t
         )q
where d_wy-1 <> 0
group by
 uid
order by
    days_window desc, round(count(*)/a_e_cy * d_wy,2) desc;
                            

这是最内层形成初始表主要知识点有:

窗口函数

1.first_value() 可以查排序完了第()列第1个,last最后一个

2.lead() 分组中位于当前行后n行(lead)/ 前n行(lag)的记录值

上个图看看就是

alt

因为是今年所以上一个where year(start_time) = 2021

SELECT
                      uid,start_time,
                      first_value(start_time) over(
                          partition by uid
                          order by start_time
                          ) first,
                      lead(start_time,1) over (
                          partition by uid
                          order by start_time
                          ) leade,
                      last_value(start_time) over(
                          partition by uid
                          order by start_time
                          ) last
                  from exam_record
                  where year(start_time) = 2021

第二层就是二次select完了之后计算日期,,就在这里我遇到了一个大坑,那就是不能直接用timestampdiff来计算start_time跟查出来的leade,first,last之间的日期差为什么呢?

来上图! alt

图中第19行 9-2到9-6算的是4天,

有的小伙伴说我没加1,但是你看第六行9-3到9-7 算的是5天,我+1了!!

所以就是时间问题一个是12点到10点,一个就是12点到12点,,

然后正确答案是按照5天算的,这是题目问题,但是没办法,所以我就date(计算的_time)

其实可以直接datediff()我当时猪脑过载了

alt

题出的有问题,,,我找这个错真的是难啊qaq

select
             uid,
             start_time,
            leade,
             timestampdiff(day,date(start_time),date(leade))+1 d_wyy,
             timestampdiff(day,date(first),date(last))+1 a_e_cyy,
             first_value(timestampdiff(day,date(start_time),date(leade))+1) over(
                 partition by uid
                 order by (timestampdiff(day,date(start_time),date(leade))+1) desc
                 )  d_wy,
             last_value(  timestampdiff(day,date(first),date(last))+1) over(
                 partition by uid
                 order by ( timestampdiff(day,date(first),date(last))+1) desc
                 )  a_e_cy

最后一层没啥说的不说了,难受今天这个题真的是让人难受!