200字范文 > sql聚合分组查询-聚合函数

sql聚合分组查询-聚合函数

时间：2019-02-24 02:28:54

相关推荐

sql聚合分组查询-聚合函数

文章目录

知识题目1常规解法1：解法2：解法3：解法4：解法５（利用专用窗口函数：rank、dense_rank、row_number）：题目2解1：用if条件函数解2：在聚合函数中添加条件语句解3：用case when...then...end题目3：解1：聚合函数求出平均值，作为表中的一个筛选条件解2：解1的另一种写法解3：用窗口函数直接求出avg，作为一列。然后score与avg比较解4：解3的另写（在部分字段前加上了表名)解5：错误写法：报错"Table 't' doesn't exist"题目4：解1：解2：子查询解3：基于解2

知识

聚合函数：max、min、count、avg、sum

分组函数：group by

聚合函数会自动忽略值为null的行聚合函数只能直接加到SELECT, HAVING, GROUP BY后面

题目1

/practice/a690f76a718242fd80757115d305be45?tpId=240&tqId=2180959&ru=/ta/sql-advanced&qru=/ta/sql-advanced/question-ranking

常规解法1：

select tag, difficulty, round( (sum(score)-max(score)-min(score)) / (count(score)-2) ,1) as clip_avg_scorefrom examination_info as t1inner joinexam_record as t2on t1.exam_id=t2.exam_idwheretag="SQL" and difficulty="hard";

解法2：

select tag, difficulty, round( (sum(score)-max(score)-min(score)) / (count(score)-2) ,1) as clip_avg_scorefrom examination_infojoin # 默认是inner joinexam_record using(exam_id)wheretag="SQL" and difficulty="hard";

注意：

using只能在相同字段名的判等使用。

解法3：

select tag, difficulty, round(avg(score), 1) as clip_avg_scorefrom exam_record as t1left join # inner join 也对 examination_info as t2 using(exam_id)wheretag="SQL" and difficulty="hard"and score not in (select max(score) from exam_record)and score not in (select min(score) from exam_record);

注意：

例如最后的score not in (select max(score) from exam_record)不能写为score is not max(score)是因为聚合函数只能直接加到SELECT, HAVING, GROUP BY后面，因此要写一个子查询，先把最大、最小的score查找出来。例如最后的score not in (select max(score) from exam_record)的表名只能是原表名，不能是t1。否则报错：“Table ‘t1’ doesn’t exist”。

解法4：

select tag, difficulty, round(avg(score), 1) as clip_avg_scorefrom exam_record as t1left join # inner join 也对 examination_info as t2 using(exam_id)wheretag="SQL" and difficulty="hard"and score < (select max(score) from exam_record) and score > (select min(score) from exam_record)

解法５（利用专用窗口函数：rank、dense_rank、row_number）：

select tag, difficulty, round(avg(score), 1) as clip_avg_scorefrom(select-- exam_record.id ,exam_id, tag, difficulty, score, exam_id, tag, difficulty, score, # 这儿的注意点见下面dense_rank() over(order by score asc) as score_asc_ranking,dense_rank() over(order by score desc) as score_desc_rankingfrom exam_record left join examination_info using(exam_id)where tag="SQL" and difficulty="hard" and score is not null) AS twhere t.score_asc_ranking!=1 and t.score_desc_ranking!=1group bytag;

注意：

exam_id, tag, difficulty, score,

1.若写为*，会报错：“Duplicate column name ‘id’”，因为id列是二表都有的，而虽然exam_id也是二表都有的，但它是连结字段，因此可以直接写exam_id。

2.若写为id ,exam_id, tag, difficulty, score,，会报错：“Column ‘id’ in field list is ambiguous”，跟上面1.的报错原理其实一样。

3.若写为exam_record.id ,exam_id, tag, difficulty, score,，即将id指定表->exam_record.id，就OK。

题目2

/practice/45a87639110841b6950ef6a12d5f?tpId=240&tags=&title=&difficulty=0&judgeStatus=0&rp=0

解1：用if条件函数

SELECTcount(start_time) as total_pv,count(submit_time) as complete_pv,count(distinct if(submit_time is not null, exam_id, NULL)) as complete_exam_cntFROMexam_record;

解2：在聚合函数中添加条件语句

SELECTcount(start_time) as total_pv,count(submit_time) as complete_pv,count(DISTINCT exam_id and score IS NOT NULL) as complete_exam_cnt,FROMexam_record;

要注意count(DISTINCT exam_id and score IS NOT NULL)，聚合函数中也可以添加条件语句.

解3：用case when…then…end

SELECTcount(start_time) as total_pv,count(submit_time) as complete_pv,#count(distinct case when submit_time is not null then exam_id else null end ) as complete_exam_cntcount(distinct case when score is not null then exam_id else null end ) as complete_exam_cntFROMexam_record;

题目3：

/practice/3de23f1204694e74b7deef08922805b2?tpId=240&tags=&title=&difficulty=0&judgeStatus=0&rp=0

解1：聚合函数求出平均值，作为表中的一个筛选条件

select min(score) as min_score_over_avgFROM exam_record as erjoinexamination_info as eion er.exam_id=ei.exam_idwhere tag="SQL" andscore>=(select avg(score) from exam_record er join examination_info ei on er.exam_id=ei.exam_idwhere tag='SQL')

解2：解1的另一种写法

select min(score) as min_score_over_avgFROM exam_record joinexamination_info using(exam_id)where tag="SQL" andscore>=(select avg(score) from exam_record er join examination_info ei on er.exam_id=ei.exam_idwhere tag='SQL')

解3：用窗口函数直接求出avg，作为一列。然后score与avg比较

selectmin(t.score) as min_score_over_avg from(selectscore,avg(score) over() as avg_score from #exam_record #inner join # left#examination_info #using(exam_id)exam_record as erleft joinexamination_info as eion er.exam_id=ei.exam_idwheretag="SQL" andscore is not null) as twhere t.score>=t.avg_score;

解4：解3的另写（在部分字段前加上了表名)

selectmin(t.score) as min_score_over_avg from(selecter.score,avg(er.score) over() as avg_score from exam_record as erleft joinexamination_info as eion er.exam_id=ei.exam_idwhereei.tag="SQL" ander.score is not null) as twhere t.score>=t.avg_score;

解5：错误写法：报错"Table ‘t’ doesn’t exist"

目前咱不知道为啥报错。。。

select min(t.score) as min_score_over_avgFROM (select scorefrom exam_record inner join # inner/left join examination_info using(exam_id) where tag="SQL" and score is not null) as twhere t.score>=(select avg(t.score) from t);

题目4：

/practice/9e2fb674b58b4f60ac765b7a37dde1b9?tpId=240&tqId=2183005&ru=/practice/3de23f1204694e74b7deef08922805b2&qru=/ta/sql-advanced/question-ranking

解1：

SELECTDATE_FORMAT(submit_time, "%Y%m") as month,round((count(distinct uid, DATE_FORMAT(submit_time, "%y%m%d"))) / count(distinct uid), 2) as avg_active_days,COUNT(distinct uid) as mauFROMexam_recordWHEREsubmit_time is not NULLand year(submit_time)=GROUP BYDATE_FORMAT(submit_time, "%Y%m")#month; 也ok

解2：子查询

selectSUBSTR(ymd, 1, 6) as month,round(count(1) / count(distinct uid), 2) as avg_active_days,count(distinct uid) as mauFROM(SELECT DISTINCT uid, DATE_FORMAT(submit_time, "%Y%m%d") as ymdfrom exam_recordwhere submit_time is not NULL and YEAR(submit_time)= # ""也OK) as tGROUP BYSUBSTR(ymd, 1, 6);#month; 也ok

注意：

count(1)就相当于count(*)，对总行数计数,其效率更高。FROM(…) as t，必须有as t，否则报错： ‘Every derived table must have its own alias’–“每个派生表必须有自己的别名”。

解3：基于解2

selectt.Ym as month,round(count(*) / count(distinct t.uid), 2) as avg_active_days,count(distinct t.uid) as mauFROM(SELECT DISTINCT uid, DATE_FORMAT(submit_time, "%Y%m") as Ym,DATE_FORMAT(submit_time, "%Y%m%d") as ymdfrom exam_recordwhere submit_time is not NULL and YEAR(submit_time)= # ""也OK) as tGROUP BYt.Ym;

或：

selectYm as month,round(count(*) / count(distinct uid), 2) as avg_active_days,count(distinct uid) as mauFROM(SELECT DISTINCT uid, DATE_FORMAT(submit_time, "%Y%m") as Ym,DATE_FORMAT(submit_time, "%Y%m%d") as ymdfrom exam_recordwhere submit_time is not NULL and YEAR(submit_time)= # ""也OK) as tGROUP BYYm;

本内容不代表本网观点和政治立场，如有侵犯你的权益请联系我们处理。

网友评论

网友评论仅供其表达个人看法，并不表明网站立场。