最近做了阿里的24年春招题,题目如下:
题目分析
在这道题目中,我们需要处理三个表,表中会用到的信息包括:
-
orders(订单表)
order_id
:订单编号product_id
:产品编号customer_id
:客户编号quantity
:购买数量
-
products(产品表)
product_id
:产品编号unit_price
:单价
-
customers(客户表)
customer_id
:客户编号customer_age
:客户年龄
目标:
- 计算每个产品2023年的总销售额、平均月销售额、单月最高销量、最大月销售量,并找出每个产品的最大购买量的客户的年龄段。
输出要求:
- 按照每个产品的总销售额降序排列。
- 如果总销售额一致,则按照产品的ID升序排列。
- 当存在两个客户年购买量都是最高时,customer_age_group展示年龄小的顾客的年龄段。
知识点关键词:窗口函数、分组、连接、条件判断
解答步骤
步骤1:计算每个产品的年度销售数据
我们首先需要计算每个产品的总销售额、总购买量、平均月销售额和最大月购买量。
- 窗口函数:使用
sum()
和max()
窗口函数来计算每个产品的总购买量和最大月购买量。
--每个产品的总购买量
sum(o.quantity) over(partition by p.product_id) as total_quantity
--最大月购买量
max(quantity) over(partition by product_id) as max_monthly_quantity
- 分区:通过
partition by
按产品编号分区,确保计算是在每个产品的范围内进行。
partition by product_id
- 计算总销售额:通过
sum(o.quantity) * unit_price
计算。
sum(o.quantity) over(partition by p.product_id) * unit_price as total_sales
- 计算平均月销售额:将总销售额除以 12,并使用
round()
函数保留两位小数。
round(sum(o.quantity) over(partition by p.product_id) * unit_price / 12.0, 2) as avg_monthly_sales
步骤一全部代码:
with year_sales as (
select
p.product_id,
p.unit_price,
sum(o.quantity) over(partition by p.product_id) * unit_price as total_sales,
sum(o.quantity) over(partition by p.product_id) as total_quantity,
round(sum(o.quantity) over(partition by p.product_id) * unit_price / 12.0, 2) as avg_monthly_sales,
max(quantity) over(partition by product_id) as max_monthly_quantity,
o.customer_id
from orders o
join products p on p.product_id = o.product_id
)
步骤2:计算每个产品的客户购买量排名
接下来,我们需要计算每个产品的客户购买量,并对其进行排名。
- 分组:使用
group by
计算每个客户对每个产品的总购买量。
select
...
sum(quantity) as customer_sum
...
group by customer_id, product_id
- 排名:使用
dense_rank()
窗口函数对客户购买量进行排名,按用户id分组,按购买量降序排列,年龄升序排列。
dense_rank() over(partition by o.product_id order by o.customer_sum desc, c.customer_age asc) as rk
步骤二全部代码:
customers_group as (
select
c.customer_id,
o.product_id,
c.customer_age,
dense_rank() over(partition by o.product_id order by o.customer_sum desc, c.customer_age asc) as rk
from (
select
customer_id,
product_id,
sum(quantity) as customer_sum
from orders
group by customer_id, product_id
) o
join customers c on c.customer_id = o.customer_id
order by o.product_id
)
步骤3:选择每个产品的最大购买量客户及其年龄段
最后,我们选择每个产品的最大购买量客户,并根据其年龄段进行分组。
- 条件判断:使用
case when
语句将客户年龄分组。
case
when c.customer_age between 1 and 10 then '1-10'
...
when c.customer_age between 51 and 60 then '51-60'
else '61+'
end as customer_age_group
- 连接:将
year_sales
和customers_group
连接,选择排名第一的客户。
join year_sales ys on ys.product_id = c.product_id
where c.rk = 1
步骤三全部代码:
select distinct
ys.product_id,
ys.total_sales,
ys.unit_price,
ys.total_quantity,
ys.avg_monthly_sales,
ys.max_monthly_quantity,
case
when c.customer_age between 1 and 10 then '1-10'
when c.customer_age between 11 and 20 then '11-20'
when c.customer_age between 21 and 30 then '21-30'
when c.customer_age between 31 and 40 then '31-40'
when c.customer_age between 41 and 50 then '41-50'
when c.customer_age between 51 and 60 then '51-60'
else '61+'
end as customer_age_group
from customers_group c
join year_sales ys on ys.product_id = c.product_id
where c.rk = 1
order by ys.total_sales desc, ys.product_id asc
解题的全部代码
with year_sales as (
select
p.product_id,
p.unit_price,
sum(o.quantity) over(partition by p.product_id) * unit_price as total_sales,
sum(o.quantity) over(partition by p.product_id) as total_quantity,
round(sum(o.quantity) over(partition by p.product_id) * unit_price / 12.0, 2) as avg_monthly_sales,
max(quantity) over(partition by product_id) as max_monthly_quantity,
o.customer_id
from orders o
join products p on p.product_id = o.product_id
),
customers_group as (
select
c.customer_id,
o.product_id,
c.customer_age,
dense_rank() over(partition by o.product_id order by o.customer_sum desc, c.customer_age asc) as rk
from (
select
customer_id,
product_id,
sum(quantity) as customer_sum
from orders
group by customer_id, product_id
) o
join customers c on c.customer_id = o.customer_id
order by o.product_id
)
select distinct
ys.product_id,
ys.total_sales,
ys.unit_price,
ys.total_quantity,
ys.avg_monthly_sales,
ys.max_monthly_quantity,
case
when c.customer_age between 1 and 10 then '1-10'
when c.customer_age between 11 and 20 then '11-20'
when c.customer_age between 21 and 30 then '21-30'
when c.customer_age between 31 and 40 then '31-40'
when c.customer_age between 41 and 50 then '41-50'
when c.customer_age between 51 and 60 then '51-60'
else '61+'
end as customer_age_group
from customers_group c
join year_sales ys on ys.product_id = c.product_id
where c.rk = 1
order by ys.total_sales desc, ys.product_id asc
近似题目练习推荐
-
牛客每个人最近的登录日期(一)
- 题目链接:牛客每个人最近的登录日期(一)
- 知识点:
ROW_NUMBER()
窗口函数、PARTITION BY
、ORDER BY
、JOIN
。
-
牛客每个人最近的登录日期(二)
- 题目链接:牛客每个人最近的登录日期(二)
- 知识点:
ROW_NUMBER()
窗口函数、PARTITION BY
、ORDER BY
、JOIN
。
-
牛客每个人最近的登录日期(三)
- 题目链接:牛客每个人最近的登录日期(三)
- 知识点:
CASE
函数、MAX
函数、GROUP BY
。
-
获取有奖金的员工相关信息
- 题目链接:获取有奖金的员工相关信息
- 知识点:
CASE
函数、WHERE
条件过滤、ORDER BY
。
-
更新员工信息表
- 题目链接:更新员工信息表
- 知识点:窗口函数、分组、排序、连接操作