最近做了阿里的24年春招题,题目如下: alt

alt alt

题目分析

在这道题目中,我们需要处理三个表,表中会用到的信息包括:

  • orders(订单表)

    • order_id:订单编号
    • product_id:产品编号
    • customer_id:客户编号
    • quantity:购买数量
  • products(产品表)

    • product_id:产品编号
    • unit_price:单价
  • customers(客户表)

    • customer_id:客户编号
    • customer_age:客户年龄

目标

  • 计算每个产品2023年的总销售额、平均月销售额、单月最高销量、最大月销售量,并找出每个产品的最大购买量的客户的年龄段。

输出要求

  • 按照每个产品的总销售额降序排列。
  • 如果总销售额一致,则按照产品的ID升序排列。
  • 当存在两个客户年购买量都是最高时,customer_age_group展示年龄小的顾客的年龄段。

知识点关键词:窗口函数、分组、连接、条件判断

解答步骤

步骤1:计算每个产品的年度销售数据

我们首先需要计算每个产品的总销售额、总购买量、平均月销售额和最大月购买量。

  • 窗口函数:使用 sum()max() 窗口函数来计算每个产品的总购买量和最大月购买量。
--每个产品的总购买量
 sum(o.quantity) over(partition by p.product_id) as total_quantity
--最大月购买量
max(quantity) over(partition by product_id) as max_monthly_quantity
  • 分区:通过 partition by 按产品编号分区,确保计算是在每个产品的范围内进行。
partition by product_id
  • 计算总销售额:通过 sum(o.quantity) * unit_price 计算。
sum(o.quantity) over(partition by p.product_id) * unit_price as total_sales
  • 计算平均月销售额:将总销售额除以 12,并使用 round() 函数保留两位小数。
round(sum(o.quantity) over(partition by p.product_id) * unit_price / 12.0, 2) as avg_monthly_sales

步骤一全部代码

with year_sales as (
    select 
    p.product_id,
    p.unit_price,
    sum(o.quantity) over(partition by p.product_id) * unit_price as total_sales,
    sum(o.quantity) over(partition by p.product_id) as total_quantity,
    round(sum(o.quantity) over(partition by p.product_id) * unit_price / 12.0, 2) as avg_monthly_sales,
    max(quantity) over(partition by product_id) as max_monthly_quantity,
    o.customer_id
    from orders o 
    join products p on p.product_id = o.product_id
)

步骤2:计算每个产品的客户购买量排名

接下来,我们需要计算每个产品的客户购买量,并对其进行排名。

  • 分组:使用 group by 计算每个客户对每个产品的总购买量。
select
    ...
    sum(quantity) as customer_sum
    ...
group by customer_id, product_id
  • 排名:使用 dense_rank() 窗口函数对客户购买量进行排名,按用户id分组,按购买量降序排列,年龄升序排列。
dense_rank() over(partition by o.product_id order by o.customer_sum desc, c.customer_age asc) as rk

步骤二全部代码

customers_group as (
    select 
    c.customer_id,
    o.product_id,
    c.customer_age,
    dense_rank() over(partition by o.product_id order by o.customer_sum desc, c.customer_age asc) as rk
    from (
        select
        customer_id,
        product_id,
        sum(quantity) as customer_sum
        from orders
        group by customer_id, product_id
    ) o 
    join customers c on c.customer_id = o.customer_id
    order by o.product_id
)

步骤3:选择每个产品的最大购买量客户及其年龄段

最后,我们选择每个产品的最大购买量客户,并根据其年龄段进行分组。

  • 条件判断:使用 case when 语句将客户年龄分组。
    case
        when c.customer_age between 1 and 10 then '1-10'
        ...
        when c.customer_age between 51 and 60 then '51-60'
        else '61+'
    end as customer_age_group
  • 连接:将 year_salescustomers_group 连接,选择排名第一的客户。
join year_sales ys on ys.product_id = c.product_id
where c.rk = 1

步骤三全部代码

select distinct 
    ys.product_id,
    ys.total_sales,
    ys.unit_price,
    ys.total_quantity,
    ys.avg_monthly_sales,
    ys.max_monthly_quantity,
    case 
        when c.customer_age between 1 and 10 then '1-10'
        when c.customer_age between 11 and 20 then '11-20'
        when c.customer_age between 21 and 30 then '21-30'
        when c.customer_age between 31 and 40 then '31-40'
        when c.customer_age between 41 and 50 then '41-50'
        when c.customer_age between 51 and 60 then '51-60'
        else '61+'
    end as customer_age_group
from customers_group c
join year_sales ys on ys.product_id = c.product_id
where c.rk = 1
order by ys.total_sales desc, ys.product_id asc

解题的全部代码

with year_sales as (
    select 
    p.product_id,
    p.unit_price,
    sum(o.quantity) over(partition by p.product_id) * unit_price as total_sales,
    sum(o.quantity) over(partition by p.product_id) as total_quantity,
    round(sum(o.quantity) over(partition by p.product_id) * unit_price / 12.0, 2) as avg_monthly_sales,
    max(quantity) over(partition by product_id) as max_monthly_quantity,
    o.customer_id
    from orders o 
    join products p on p.product_id = o.product_id
),
customers_group as (
    select 
    c.customer_id,
    o.product_id,
    c.customer_age,
    dense_rank() over(partition by o.product_id order by o.customer_sum desc, c.customer_age asc) as rk
    from (
        select
        customer_id,
        product_id,
        sum(quantity) as customer_sum
        from orders
        group by customer_id, product_id
    ) o 
    join customers c on c.customer_id = o.customer_id
    order by o.product_id
)
select distinct 
    ys.product_id,
    ys.total_sales,
    ys.unit_price,
    ys.total_quantity,
    ys.avg_monthly_sales,
    ys.max_monthly_quantity,
    case 
        when c.customer_age between 1 and 10 then '1-10'
        when c.customer_age between 11 and 20 then '11-20'
        when c.customer_age between 21 and 30 then '21-30'
        when c.customer_age between 31 and 40 then '31-40'
        when c.customer_age between 41 and 50 then '41-50'
        when c.customer_age between 51 and 60 then '51-60'
        else '61+'
    end as customer_age_group
from customers_group c
join year_sales ys on ys.product_id = c.product_id
where c.rk = 1
order by ys.total_sales desc, ys.product_id asc

近似题目练习推荐