究极冗长版,纯粹自己的思考过程

step 1: 首先筛选出符合条件的区域范围,需要将三个表全部联合,放到CTE中,筛选条件 shop_id = 901,时间范围最早2021-09-25,最晚2021-10-03

WITH sale_info AS
(SELECT td.product_id,
        DATE(too.event_time) AS dt_order,
        DATE(tp.release_time) AS dt_release
   FROM tb_order_overall too
LEFT JOIN tb_order_detail td 
     ON too.order_id = td.order_id
LEFT JOIN tb_product_info tp
     ON td.product_id = tp.product_id
  WHERE tp.shop_id = 901
    AND DATE(too.event_time) >= "2021-09-25"
    AND DATE(too.event_time) <= "2021-10-03")

step 2: 我希望最终能得到如下的表格,这样我就可以直接进行计算和得到结果,并且一整个思路都是为了得到这个表格

  • NO. sales / No.release --> 动销率
  • 1-动销率 --> 滞销率

DATE

NUMBER OF SALES

NUMBER OF RELEASE

Step 2.1. 找到所有日期下No. sales, 实施的时候反复陷入以下误区

  • 如果tb_order_detail没有出现10-01到10-03的数据,输出结果会直接将他们扔出去
  • COUNT(DISTICT ...) OVER( )不存在

我希望在计算的时候呈现如下的表格,这样我就可以通过COUNT(DISTINCT t2.product_id)来得到10-01 -- 10-03 所有的在售产品数量。t1.dt中指包含了10-01到10-03, t2.dt_order来源于tb_order_overall

t1.dt

t2.dt_order

t2.product_id

10-01

09-25

801

10-01

09-27

801

10-01

10-01

802

10-02

09-29

801

10-02

10-02

802

10-03

NULL

NULL

#date_series-->为了解决问题1:但这个date——series的创建很愚蠢,一旦数量变多拓展到10-30或者更远,就会很离谱,不知道怎么解决
date_series AS
(SELECT "2021-10-01" AS dt
   UNION
 SELECT "2021-10-02" AS dt
   UNION
 SELECT "2021-10-03" AS dt),

#通过left join来保留所有10-01到10-03的信息
sale_num AS
(SELECT DISTINCT s1.dt AS dt_order, 
        COUNT(DISTINCT IF(s2.dt_order >= dt_release, s2.product_id, NULL)) AS sale_num
    # dt_order >=dt_release 为了筛选那些购买了没有上架的产品,感觉这个纯粹多想了
   FROM date_series s1
 LEFT JOIN sale_info s2
     ON s2.dt_order BETWEEN DATE_SUB(s1.dt, INTERVAL 6 DAY) AND s1.dt
 #通过这个方式来得到和窗口函数同样的效果
  GROUP BY s1.dt),

Step 2.2. 找到No.release

因为no.sales的表格已经做好了,所以no.release可以不用管日期直接left join上面建好的no.sale表格就可以

 final AS
 (SELECT sn.dt_order,
         sn.sale_num,
         COUNT(DISTINCT tbi.product_id) AS rolling_release
    FROM sale_num sn
 LEFT JOIN tb_product_info tbi 
        ON sn.dt_order >= DATE(tbi.release_time)
 WHERE tbi.shop_id = 901 
  GROUP BY sn.dt_order, sn.sale_num)

Step 2.3: 最后一步就是计算动销率和滞销率

  • 只要当天任一店铺有任何商品的销量就输出该天的结果,即使店铺901当天的动销率为0。-->所以在where语句中需要筛选出符合条件的日期
SELECT dt_order AS dt,
       ROUND(sale_num/rolling_release, 3) AS sale_rate,
       ROUND(1 - sale_num/rolling_release, 3) AS unsale_rate
  FROM final
WHERE dt_order IN (SELECT DISTINCT DATE(event_time)
          FROM tb_order_overall #找到10-01 -- 10-03期间是否有任意商店有销量
        WHERE DATE(event_time) BETWEEN "2021-10-01" AND "2021-10-03")
ORDER BY dt;

最后的汇总代码

WITH sale_info AS
(SELECT td.product_id,
        DATE(too.event_time) AS dt_order,
        DATE(tp.release_time) AS dt_release
   FROM tb_order_overall too
LEFT JOIN tb_order_detail td 
     ON too.order_id = td.order_id
LEFT JOIN tb_product_info tp
     ON td.product_id = tp.product_id
  WHERE tp.shop_id = 901
    AND DATE(too.event_time) >= "2021-09-25"
    AND DATE(too.event_time) <= "2021-10-03"),
#########################
date_series AS
(SELECT "2021-10-01" AS dt
   UNION
 SELECT "2021-10-02" AS dt
   UNION
 SELECT "2021-10-03" AS dt),
####################
sale_num AS
(SELECT DISTINCT s1.dt AS dt_order, #可能没有10-01的数据但是有09-25的
        COUNT(DISTINCT IF(s2.dt_order >= dt_release, s2.product_id, NULL)) AS sale_num
   FROM date_series s1
 LEFT JOIN sale_info s2
     ON s2.dt_order BETWEEN DATE_SUB(s1.dt, INTERVAL 6 DAY) AND s1.dt
  GROUP BY s1.dt),
##################
 final AS
 (SELECT sn.dt_order,
         sn.sale_num,
         COUNT(DISTINCT tbi.product_id) AS rolling_release
    FROM sale_num sn
 LEFT JOIN tb_product_info tbi 
        ON sn.dt_order >= DATE(tbi.release_time)
 WHERE tbi.shop_id = 901 
  GROUP BY sn.dt_order, sn.sale_num)
  
  ###################################

SELECT dt_order AS dt,
       ROUND(sale_num/rolling_release, 3) AS sale_rate,
       ROUND(1 - sale_num/rolling_release, 3) AS unsale_rate
  FROM final
WHERE dt_order IN (SELECT DISTINCT DATE(event_time)
          FROM tb_order_overall
        WHERE DATE(event_time) BETWEEN "2021-10-01" AND "2021-10-03")
ORDER BY dt;