题解 | 贷款情况

with sub1 as(
    select b.city as city,
    round(sum(a.loan_amount),2) as total_loan_amount,
    round(sum(a.loan_amount)/count(distinct b.customer_id),2) as average_loan_amount,
    count(distinct b.customer_id) as total_customers
    from loan_applications a join customers b using(customer_id)
    group by b.city),
    sub2 as(
    select city,loan_type_name as most_applied_loan_type
    from(
    select b.city as city,
    c.loan_type_name as loan_type_name,
    row_number()over(partition by b.city order by count(c.loan_type_name) desc,c.loan_type_id) as rk
    from loan_applications a join customers b using(customer_id)
    join loan_application_types d using(application_id)
    join loan_types c using(loan_type_id)
    group by b.city,c.loan_type_id,c.loan_type_name) as sub3
    where sub3.rk=1)
    select sub1.city,total_loan_amount,average_loan_amount,total_customers,most_applied_loan_type
    from sub1 join sub2 using(city)
    order by sub1.city

存在误区：

1、客户数量，务必customer_id去重求值；

2、贷款均值=贷款总额/客户数量，而非“每笔贷款的均值”；

3、每个城市最常见的贷款类型，partition by city 是根据城市分区，order by count(c.loan_type_name)再计算该城市下，每个贷款类型数量，进行排序；partition by city，c.loan_type_name是直接根据城市且贷款类型组合分区，即使排序后，每个是单独一个序号，都为1——这种是错误的。

4、分组颗粒度，group by b.city,c.loan_type_id,c.loan_type_name和group by b.city,c.loan_type_name区别：如果c.loan_type_name一致，后者其实两个不同的类型就会划到一组，所以最好加上“类型id”