select city ,total_loan_amount ,average_loan_amount ,total_customers ,loan_type_name as most_applied_loan_type from ( select city ,round(sum(loan_amount),2) as total_loan_amount ,round(sum(loan_amount) / count(distinct customer_name),2) as average_loan_amount ,round(count(distinct customer_id),2) as total_customers from loan_applications join customers using(customer_id) group by city ) as t1 left join ( select city ,loan_type_id ,loan_type_name ,row_number()over(partition by city order by count(loan_type_name) DESC,loan_type_id ASC ) as rk from loan_types join loan_application_types using(loan_type_id) join loan_applications using(application_id) join customers using(customer_id) group by city,loan_type_id,loan_type_name ) as t2 using(city) where rk = 1;
这题一眼就该看出
{ city: 城市名称; total_loan_amount: 该城市所有客户的贷款申请总金额,保留小数点后2位; average_loan_amount: 该城市所有客户的平均每个人的贷款申请金额,保留小数点后2位; total_customers: 该城市的客户数量 }
这些个字段都是可以通过连接loan_applications和customers后 聚合city计算 就能立刻找到;
而“该城市最常申请的贷款类型名称”这个字段的聚合筛选条件明显与其他字段不同,所以单开一个子查询去求。毫无疑问,它需要连接所有四张表,先进行排名,而后对排名筛选得到结果 聚合条件为 city,loan_type_id,loan_type_name
这里需要注意的难点是:根据题意,排名条件应为:row_number()over(partition by city order by count(loan_type_name) DESC,loan_type_id ASC ) 而要想loan_type_id也作为排名条件,就必须在前面的select 字段中也加上loan_type_id,不然你group by 之后是找不到这个条件的