究极冗长版,纯粹自己的思考过程
step 1: 首先筛选出符合条件的区域范围,需要将三个表全部联合,放到CTE中,筛选条件 shop_id = 901,时间范围最早2021-09-25,最晚2021-10-03
WITH sale_info AS (SELECT td.product_id, DATE(too.event_time) AS dt_order, DATE(tp.release_time) AS dt_release FROM tb_order_overall too LEFT JOIN tb_order_detail td ON too.order_id = td.order_id LEFT JOIN tb_product_info tp ON td.product_id = tp.product_id WHERE tp.shop_id = 901 AND DATE(too.event_time) >= "2021-09-25" AND DATE(too.event_time) <= "2021-10-03")
step 2: 我希望最终能得到如下的表格,这样我就可以直接进行计算和得到结果,并且一整个思路都是为了得到这个表格
- NO. sales / No.release --> 动销率
- 1-动销率 --> 滞销率
DATE | NUMBER OF SALES | NUMBER OF RELEASE |
Step 2.1. 找到所有日期下No. sales, 实施的时候反复陷入以下误区
- 如果tb_order_detail没有出现10-01到10-03的数据,输出结果会直接将他们扔出去
- COUNT(DISTICT ...) OVER( )不存在
我希望在计算的时候呈现如下的表格,这样我就可以通过COUNT(DISTINCT t2.product_id)来得到10-01 -- 10-03 所有的在售产品数量。t1.dt中指包含了10-01到10-03, t2.dt_order来源于tb_order_overall
t1.dt | t2.dt_order | t2.product_id |
10-01 | 09-25 | 801 |
10-01 | 09-27 | 801 |
10-01 | 10-01 | 802 |
10-02 | 09-29 | 801 |
10-02 | 10-02 | 802 |
10-03 | NULL | NULL |
#date_series-->为了解决问题1:但这个date——series的创建很愚蠢,一旦数量变多拓展到10-30或者更远,就会很离谱,不知道怎么解决 date_series AS (SELECT "2021-10-01" AS dt UNION SELECT "2021-10-02" AS dt UNION SELECT "2021-10-03" AS dt), #通过left join来保留所有10-01到10-03的信息 sale_num AS (SELECT DISTINCT s1.dt AS dt_order, COUNT(DISTINCT IF(s2.dt_order >= dt_release, s2.product_id, NULL)) AS sale_num # dt_order >=dt_release 为了筛选那些购买了没有上架的产品,感觉这个纯粹多想了 FROM date_series s1 LEFT JOIN sale_info s2 ON s2.dt_order BETWEEN DATE_SUB(s1.dt, INTERVAL 6 DAY) AND s1.dt #通过这个方式来得到和窗口函数同样的效果 GROUP BY s1.dt),
Step 2.2. 找到No.release
因为no.sales的表格已经做好了,所以no.release可以不用管日期直接left join上面建好的no.sale表格就可以
final AS (SELECT sn.dt_order, sn.sale_num, COUNT(DISTINCT tbi.product_id) AS rolling_release FROM sale_num sn LEFT JOIN tb_product_info tbi ON sn.dt_order >= DATE(tbi.release_time) WHERE tbi.shop_id = 901 GROUP BY sn.dt_order, sn.sale_num)
Step 2.3: 最后一步就是计算动销率和滞销率
- 只要当天任一店铺有任何商品的销量就输出该天的结果,即使店铺901当天的动销率为0。-->所以在where语句中需要筛选出符合条件的日期
SELECT dt_order AS dt, ROUND(sale_num/rolling_release, 3) AS sale_rate, ROUND(1 - sale_num/rolling_release, 3) AS unsale_rate FROM final WHERE dt_order IN (SELECT DISTINCT DATE(event_time) FROM tb_order_overall #找到10-01 -- 10-03期间是否有任意商店有销量 WHERE DATE(event_time) BETWEEN "2021-10-01" AND "2021-10-03") ORDER BY dt;
最后的汇总代码
WITH sale_info AS (SELECT td.product_id, DATE(too.event_time) AS dt_order, DATE(tp.release_time) AS dt_release FROM tb_order_overall too LEFT JOIN tb_order_detail td ON too.order_id = td.order_id LEFT JOIN tb_product_info tp ON td.product_id = tp.product_id WHERE tp.shop_id = 901 AND DATE(too.event_time) >= "2021-09-25" AND DATE(too.event_time) <= "2021-10-03"), ######################### date_series AS (SELECT "2021-10-01" AS dt UNION SELECT "2021-10-02" AS dt UNION SELECT "2021-10-03" AS dt), #################### sale_num AS (SELECT DISTINCT s1.dt AS dt_order, #可能没有10-01的数据但是有09-25的 COUNT(DISTINCT IF(s2.dt_order >= dt_release, s2.product_id, NULL)) AS sale_num FROM date_series s1 LEFT JOIN sale_info s2 ON s2.dt_order BETWEEN DATE_SUB(s1.dt, INTERVAL 6 DAY) AND s1.dt GROUP BY s1.dt), ################## final AS (SELECT sn.dt_order, sn.sale_num, COUNT(DISTINCT tbi.product_id) AS rolling_release FROM sale_num sn LEFT JOIN tb_product_info tbi ON sn.dt_order >= DATE(tbi.release_time) WHERE tbi.shop_id = 901 GROUP BY sn.dt_order, sn.sale_num) ################################### SELECT dt_order AS dt, ROUND(sale_num/rolling_release, 3) AS sale_rate, ROUND(1 - sale_num/rolling_release, 3) AS unsale_rate FROM final WHERE dt_order IN (SELECT DISTINCT DATE(event_time) FROM tb_order_overall WHERE DATE(event_time) BETWEEN "2021-10-01" AND "2021-10-03") ORDER BY dt;