select count(if(tomorrow is not null,1,null))/count(first_day) as avg_ret from( select qpd1.device_id, qpd1.date as first_day, qpd2.date as tomorrow, row_number() over(partition by qpd1.device_id,qpd1.date) as rn from question_practice_detail qpd1 left join question_practice_detail qpd2 on qpd2.date=DATE_ADD(qpd1.date, interval 1 day) and qpd1.device_id=qpd2.device_id ) as date_compare where rn=1
首先需要通过将表与自身的left join的date这一列做对比,获得哪些用户第二天又来了的记录,第二天没来tomorrow这一列的值就是null,但是由于由于用户每天来做多道题会留下多条记录,因此需要使用row_number函数和where语句筛选出同一用户在同一天答题的唯一记录,记录的总数是留存率的分母,而tomorrow这一列的非null值的个数就是分子,从而计算得到留存率。