WITH t1 AS( SELECT t1.video_id, SUM(if_like) AS like_cnt, SUM(IF(comment_id IS NULL, 0, 1)) AS comment_cnt, SUM(if_retweet) AS retweet_cnt, AVG(IF(TIMESTAMPDIFF(second, start_time, end_time) >= duration, 1, 0)) AS finish_rate, 1 / (DATEDIFF((SELECT DATE(MAX(end_time)) FROM tb_user_video_log), DATE(MAX(end_time))) + 1) AS fresh_rate FROM tb_user_video_log t1 JOIN tb_video_info t2 ON t1.video_id = t2.video_id WHERE release_time >= DATE_ADD((SELECT DATE(MAX(end_time)) FROM tb_user_video_log), INTERVAL -29 DAY) GROUP BY t1.video_id ) SELECT video_id, ROUND((100*finish_rate + 5*like_cnt + 3*comment_cnt + 2*retweet_cnt) * fresh_rate, 0) AS hot_index FROM t1 ORDER BY hot_index DESC LIMIT 3
总体思路为按照题干里提到的几个元素一个一个算。点赞数、评论数、转发数和 完播率都可以使用聚合函数 + IF条件解决。新鲜率需要计算一个时间差,需要注意要提取所有记录中某属性的最大值不能直接使用MAX,而是应该用一个小的子查询括起来,本题中是(SELECT DATE(MAX(end_time)) FROM tb_user_video_log),在下面的WHERE过滤条件中也是如此。
在把需要使用的字段全部准备好之后,在主查询中按照公式计算出热度,再排序和选取前三条记录即可。