究极冗长版,纯粹自己的思考过程
step 1: 首先筛选出符合条件的区域范围,需要将三个表全部联合,放到CTE中,筛选条件 shop_id = 901,时间范围最早2021-09-25,最晚2021-10-03
WITH sale_info AS
(SELECT td.product_id,
DATE(too.event_time) AS dt_order,
DATE(tp.release_time) AS dt_release
FROM tb_order_overall too
LEFT JOIN tb_order_detail td
ON too.order_id = td.order_id
LEFT JOIN tb_product_info tp
ON td.product_id = tp.product_id
WHERE tp.shop_id = 901
AND DATE(too.event_time) >= "2021-09-25"
AND DATE(too.event_time) <= "2021-10-03")
step 2: 我希望最终能得到如下的表格,这样我就可以直接进行计算和得到结果,并且一整个思路都是为了得到这个表格
- NO. sales / No.release --> 动销率
- 1-动销率 --> 滞销率
DATE | NUMBER OF SALES | NUMBER OF RELEASE |
Step 2.1. 找到所有日期下No. sales, 实施的时候反复陷入以下误区
- 如果tb_order_detail没有出现10-01到10-03的数据,输出结果会直接将他们扔出去
- COUNT(DISTICT ...) OVER( )不存在
我希望在计算的时候呈现如下的表格,这样我就可以通过COUNT(DISTINCT t2.product_id)来得到10-01 -- 10-03 所有的在售产品数量。t1.dt中指包含了10-01到10-03, t2.dt_order来源于tb_order_overall
t1.dt | t2.dt_order | t2.product_id |
10-01 | 09-25 | 801 |
10-01 | 09-27 | 801 |
10-01 | 10-01 | 802 |
10-02 | 09-29 | 801 |
10-02 | 10-02 | 802 |
10-03 | NULL | NULL |
#date_series-->为了解决问题1:但这个date——series的创建很愚蠢,一旦数量变多拓展到10-30或者更远,就会很离谱,不知道怎么解决
date_series AS
(SELECT "2021-10-01" AS dt
UNION
SELECT "2021-10-02" AS dt
UNION
SELECT "2021-10-03" AS dt),
#通过left join来保留所有10-01到10-03的信息
sale_num AS
(SELECT DISTINCT s1.dt AS dt_order,
COUNT(DISTINCT IF(s2.dt_order >= dt_release, s2.product_id, NULL)) AS sale_num
# dt_order >=dt_release 为了筛选那些购买了没有上架的产品,感觉这个纯粹多想了
FROM date_series s1
LEFT JOIN sale_info s2
ON s2.dt_order BETWEEN DATE_SUB(s1.dt, INTERVAL 6 DAY) AND s1.dt
#通过这个方式来得到和窗口函数同样的效果
GROUP BY s1.dt),
Step 2.2. 找到No.release
因为no.sales的表格已经做好了,所以no.release可以不用管日期直接left join上面建好的no.sale表格就可以
final AS
(SELECT sn.dt_order,
sn.sale_num,
COUNT(DISTINCT tbi.product_id) AS rolling_release
FROM sale_num sn
LEFT JOIN tb_product_info tbi
ON sn.dt_order >= DATE(tbi.release_time)
WHERE tbi.shop_id = 901
GROUP BY sn.dt_order, sn.sale_num)
Step 2.3: 最后一步就是计算动销率和滞销率
- 只要当天任一店铺有任何商品的销量就输出该天的结果,即使店铺901当天的动销率为0。-->所以在where语句中需要筛选出符合条件的日期
SELECT dt_order AS dt,
ROUND(sale_num/rolling_release, 3) AS sale_rate,
ROUND(1 - sale_num/rolling_release, 3) AS unsale_rate
FROM final
WHERE dt_order IN (SELECT DISTINCT DATE(event_time)
FROM tb_order_overall #找到10-01 -- 10-03期间是否有任意商店有销量
WHERE DATE(event_time) BETWEEN "2021-10-01" AND "2021-10-03")
ORDER BY dt;
最后的汇总代码
WITH sale_info AS
(SELECT td.product_id,
DATE(too.event_time) AS dt_order,
DATE(tp.release_time) AS dt_release
FROM tb_order_overall too
LEFT JOIN tb_order_detail td
ON too.order_id = td.order_id
LEFT JOIN tb_product_info tp
ON td.product_id = tp.product_id
WHERE tp.shop_id = 901
AND DATE(too.event_time) >= "2021-09-25"
AND DATE(too.event_time) <= "2021-10-03"),
#########################
date_series AS
(SELECT "2021-10-01" AS dt
UNION
SELECT "2021-10-02" AS dt
UNION
SELECT "2021-10-03" AS dt),
####################
sale_num AS
(SELECT DISTINCT s1.dt AS dt_order, #可能没有10-01的数据但是有09-25的
COUNT(DISTINCT IF(s2.dt_order >= dt_release, s2.product_id, NULL)) AS sale_num
FROM date_series s1
LEFT JOIN sale_info s2
ON s2.dt_order BETWEEN DATE_SUB(s1.dt, INTERVAL 6 DAY) AND s1.dt
GROUP BY s1.dt),
##################
final AS
(SELECT sn.dt_order,
sn.sale_num,
COUNT(DISTINCT tbi.product_id) AS rolling_release
FROM sale_num sn
LEFT JOIN tb_product_info tbi
ON sn.dt_order >= DATE(tbi.release_time)
WHERE tbi.shop_id = 901
GROUP BY sn.dt_order, sn.sale_num)
###################################
SELECT dt_order AS dt,
ROUND(sale_num/rolling_release, 3) AS sale_rate,
ROUND(1 - sale_num/rolling_release, 3) AS unsale_rate
FROM final
WHERE dt_order IN (SELECT DISTINCT DATE(event_time)
FROM tb_order_overall
WHERE DATE(event_time) BETWEEN "2021-10-01" AND "2021-10-03")
ORDER BY dt;

京公网安备 11010502036488号