Oracle和MySQL分组查询GROUP BY
真题1、Oracle和MySQL中的分组(GROUP BY)有什么区别?
答案:Oracle对于GROUP BY是严格的,所有要SELECT出来的字段必须在GROUP BY后边出现,否则会报错:“ORA-00979: not a GROUP BY expression”。而MySQL则不同,如果SELECT出来的字段在GROUP BY后面没有出现,那么会随机取出一个值,而这样查询出来的数据不准确,语义也不明确。所以,作者建议在写SQL语句的时候,应该给数据库一个非常明确的指令,而不是让数据库去猜测,这也是写SQL语句的一个非常良好的习惯。
下面给出一个示例。有一张T_MAX_LHR表,数据如下图所示,有3个字段ARTICLE、AUTHOR和PRICE。请选出每个AUTHOR的PRICE最高的记录(要包含所有字段)。
ARTICLE |
AUTHOR |
PRICE |
0001 |
B |
3.99 |
0002 |
A |
10.99 |
0003 |
C |
1.69 |
0004 |
B |
19.95 |
0005 |
A |
6.96 |
CREATE TABLE T_MAX_LHR (ARTICLE VARCHAR2(30),AUTHOR VARCHAR2(30),PRICE NUMBER); --Oracle
--CREATE TABLE T_MAX_LHR (ARTICLE VARCHAR(30),AUTHOR VARCHAR(30),PRICE FLOAT); --MySQL oracle通用
INSERT INTO T_MAX_LHR VALUES ('0001','B',3.99);
INSERT INTO T_MAX_LHR VALUES ('0002','A',10.99);
INSERT INTO T_MAX_LHR VALUES ('0003','C',1.69);
INSERT INTO T_MAX_LHR VALUES ('0004','B',19.95);
INSERT INTO T_MAX_LHR VALUES ('0005','A',6.96);
COMMIT;
SELECT * FROM T_MAX_LHR;
在Oracle中的数据:
LHR@orclasm > SELECT * FROM T_MAX_LHR;
ARTICLE AUTHOR PRICE
-------- -------- ----------
0001 B 3.99
0002 A 10.99
0003 C 1.69
0004 B 19.95
0005 A 6.96
在MySQL中的数据:
mysql> SELECT * FROM T_MAX_LHR;
+---------+--------+-------+
| ARTICLE | AUTHOR | PRICE |
+---------+--------+-------+
| 0001 | B | 3.99 |
| 0002 | A | 10.99 |
| 0003 | C | 1.69 |
| 0004 | B | 19.95 |
| 0005 | A | 6.96 |
+---------+--------+-------+
5 rows in set (0.00 sec)
分析数据后,正确答案应该是:
ARTICLE |
AUTHOR |
PRICE |
0002 |
A |
10.99 |
0003 |
C |
1.69 |
0004 |
B |
19.95 |
SELECT T.ARTICLE,T.AUTHOR,MAX(T.PRICE) FROM T_MAX_LHR T GROUP BY T.AUTHOR;
SELECT * FROM T_MAX_LHR T GROUP BY T.AUTHOR;
在Oracle中执行上面的SQL语句报错:
LHR@orclasm > SELECT T.ARTICLE,T.AUTHOR,MAX(T.PRICE) FROM T_MAX_LHR T GROUP BY T.AUTHOR;
SELECT T.ARTICLE,T.AUTHOR,MAX(T.PRICE) FROM T_MAX_LHR T GROUP BY T.AUTHOR
*
ERROR at line 1:
ORA-00979: not a GROUP BY expression
LHR@orclasm > SELECT * FROM T_MAX_LHR T GROUP BY T.AUTHOR;
SELECT * FROM T_MAX_LHR T GROUP BY T.AUTHOR
*
ERROR at line 1:
ORA-00979: not a GROUP BY expression
在MySQL中执行同样的SQL语句不会报错:
mysql> select version();
+-------------------------------------------+
| version() |
+-------------------------------------------+
| 5.6.21-enterprise-commercial-advanced-log |
+-------------------------------------------+
mysql> SELECT T.ARTICLE,T.AUTHOR,MAX(T.PRICE) FROM T_MAX_LHR T GROUP BY T.AUTHOR;
+---------+--------+--------------+
| ARTICLE | AUTHOR | MAX(T.PRICE) |
+---------+--------+--------------+
| 0002 | A | 10.99 |
| 0001 | B | 19.95 |
| 0003 | C | 1.69 |
+---------+--------+--------------+
3 rows in set (0.00 sec)
mysql> SELECT * FROM T_MAX_LHR T GROUP BY T.AUTHOR;
+---------+--------+-------+
| ARTICLE | AUTHOR | PRICE |
+---------+--------+-------+
| 0002 | A | 10.99 |
| 0001 | B | 3.99 |
| 0003 | C | 1.69 |
+---------+--------+-------+
3 rows in set (0.00 sec)
虽然执行不报错,可以查询出数据,但是从结果来看数据并不是最终想要的结果,甚至数据是错乱的。下面给出几种正确的写法(在Oracle和MySQL中均可执行):
(1)使用相关子查询
SELECT *
FROM T_MAX_LHR T
WHERE (T.AUTHOR, T.PRICE) IN (SELECT NT.AUTHOR, MAX(NT.PRICE) PRICE
FROM T_MAX_LHR NT
GROUP BY NT.AUTHOR)
ORDER BY T.ARTICLE;
SELECT *
FROM T_MAX_LHR T
WHERE T.PRICE = (SELECT MAX(NT.PRICE) PRICE
FROM T_MAX_LHR NT
WHERE T.AUTHOR = NT.AUTHOR)
ORDER BY T.ARTICLE;
(2)使用非相关子查询
SELECT T.*
FROM T_MAX_LHR T
JOIN (SELECT NT.AUTHOR, MAX(NT.PRICE) PRICE
FROM T_MAX_LHR NT
GROUP BY NT.AUTHOR) T1
ON T.AUTHOR = T1.AUTHOR
AND T.PRICE = T1.PRICE
ORDER BY T.ARTICLE;
(3)使用LEFT JOIN语句
SELECT T.*
FROM T_MAX_LHR T
LEFT OUTER JOIN T_MAX_LHR T1
ON T.AUTHOR = T1.AUTHOR
AND T.PRICE < T1.PRICE
WHERE T1.ARTICLE IS NULL
ORDER BY T.ARTICLE;
在Oracle中的执行结果:
LHR@orclasm > SELECT T.*
2 FROM T_MAX_LHR T
3 LEFT OUTER JOIN T_MAX_LHR T1
4 ON T.AUTHOR = T1.AUTHOR
5 AND T.PRICE < T1.PRICE
6 WHERE T1.ARTICLE IS NULL
7 ORDER BY T.ARTICLE;
ARTICLE AUTHOR PRICE
-------- -------- ----------
0002 A 10.99
0003 C 1.69
0004 B 19.95
在MySQL中的执行结果:
mysql> SELECT T.*
-> FROM T_MAX_LHR T
-> LEFT OUTER JOIN T_MAX_LHR T1
-> ON T.AUTHOR = T1.AUTHOR
-> AND T.PRICE < T1.PRICE
-> WHERE T1.ARTICLE IS NULL
-> ORDER BY T.ARTICLE;
+---------+--------+-------+
| ARTICLE | AUTHOR | PRICE |
+---------+--------+-------+
| 0002 | A | 10.99 |
| 0003 | C | 1.69 |
| 0004 | B | 19.95 |
+---------+--------+-------+
3 rows in set (0.00 sec)
真题2、Oracle和MySQL中的分组(GROUP BY)后的聚合函数分别是什么?
答案:在Oracle中,可以用WM_CONCAT函数或LISTAGG分析函数;在MySQL中可以使用GROUP_CONCAT函数。示例如下:
首先给出建表语句:
CREATE TABLE T_MAX_LHR (ARTICLE VARCHAR2(30),AUTHOR VARCHAR2(30),PRICE NUMBER); --Oracle
--CREATE TABLE T_MAX_LHR (ARTICLE VARCHAR(30),AUTHOR VARCHAR(30),PRICE FLOAT); --MySQL oracle通用
INSERT INTO T_MAX_LHR VALUES ('0001','B',3.99);
INSERT INTO T_MAX_LHR VALUES ('0002','A',10.99);
INSERT INTO T_MAX_LHR VALUES ('0003','C',1.69);
INSERT INTO T_MAX_LHR VALUES ('0004','B',19.95);
INSERT INTO T_MAX_LHR VALUES ('0005','A',6.96);
COMMIT;
SELECT * FROM T_MAX_LHR;
在MySQL中:
mysql> SELECT T.AUTHOR, GROUP_CONCAT(T.ARTICLE), GROUP_CONCAT(T.PRICE)
-> FROM T_MAX_LHR T
-> GROUP BY T.AUTHOR;
+--------+-------------------------+-----------------------+
| AUTHOR | GROUP_CONCAT(T.ARTICLE) | GROUP_CONCAT(T.PRICE) |
+--------+-------------------------+-----------------------+
| A | 0002,0005 | 10.99,6.96 |
| B | 0001,0004 | 3.99,19.95 |
| C | 0003 | 1.69 |
+--------+-------------------------+-----------------------+
3 rows in set (0.00 sec)
在Oracle中:
LHR@orclasm > SELECT T.AUTHOR, WM_CONCAT(T.ARTICLE) ARTICLE, WM_CONCAT(T.PRICE) PRICE
2 FROM T_MAX_LHR T
3 GROUP BY T.AUTHOR;
AUTHOR ARTICLE PRICE
-------- --------------- ---------------
A 0002,0005 10.99,6.96
B 0001,0004 3.99,19.95
C 0003 1.69
LHR@orclasm > SELECT T.AUTHOR,
2 LISTAGG(T.ARTICLE, ',') WITHIN GROUP(ORDER BY T.PRICE) ARTICLE,
3 LISTAGG(T.PRICE, ',') WITHIN GROUP(ORDER BY T.PRICE) PRICE
4 FROM T_MAX_LHR T
5 GROUP BY T.AUTHOR;
AUTHOR ARTICLE PRICE
-------- --------------- ---------------
A 0005,0002 6.96,10.99
B 0001,0004 3.99,19.95
C 0003 1.69
原作者不知道是谁了,这个图不是小麦苗画的。
MySQL分组查询group by使用示例
(1) group by的含义:将查询结果按照1个或多个字段进行分组,字段值相同的为一组
(2) group by可用于单个字段分组,也可用于多个字段分组
select * from employee;
+------+------+--------+------+------+-------------+
| num | d_id | name | age | sex | homeaddr |
+------+------+--------+------+------+-------------+
| 1 | 1001 | 张三 | 26 | 男 | beijinghdq |
| 2 | 1002 | 李四 | 24 | 女 | beijingcpq |
| 3 | 1003 | 王五 | 25 | 男 | changshaylq |
| 4 | 1004 | Aric | 15 | 男 | England |
+------+------+--------+------+------+-------------+
select * from employee group by d_id,sex;
select * from employee group by sex;
+------+------+--------+------+------+------------+
| num | d_id | name | age | sex | homeaddr
|+------+------+--------+------+------+------------+
| 2 | 1002 | 李四 | 24 | 女 | beijingcpq |
| 1 | 1001 | 张三 | 26 | 男 | beijinghdq |
+------+------+--------+------+------+------------+
根据sex字段来分组,sex字段的全部值只有两个('男'和'女'),所以分为了两组 当group by单独使用时,只显示出每组的第一条记录 所以group by单独使用时的实际意义不大
group by + group_concat()
(1) group_concat(字段名)可以作为一个输出字段来使用,
(2) 表示分组之后,根据分组结果,使用group_concat()来放置每一组的某字段的值的集合
select sex from employee group by sex;
+------+
| sex |
+------+
| 女 |
| 男 |
+------+
select sex,group_concat(name) from employee group by sex;
+------+--------------------+
| sex | group_concat(name) |
+------+--------------------+
| 女 | 李四 |
| 男 | 张三,王五,Aric |
+------+--------------------+
select sex,group_concat(d_id) from employee group by sex;
+------+--------------------+
| sex | group_concat(d_id) |
+------+--------------------+
| 女 | 1002 |
| 男 | 1001,1003,1004 |
+------+--------------------+
group by + 集合函数
(1) 通过group_concat()的启发,我们既然可以统计出每个分组的某字段的值的集合,那么我们也可以通过集合函数来对这个"值的集合"做一些操作
select sex,group_concat(age) from employee group by sex;
+------+-------------------+
| sex | group_concat(age) |
+------+-------------------+
| 女 | 24 |
| 男 | 26,25,15 |
+------+-------------------+
分别统计性别为男/女的人年龄平均值
select sex,avg(age) from employee group by sex;
+------+----------+
| sex | avg(age) |
+------+----------+
| 女 | 24.0000 |
| 男 | 22.0000 |
+------+----------+
分别统计性别为男/女的人的个数
select sex,count(sex) from employee group by sex;
+------+------------+
| sex | count(sex) |
+------+------------+
| 女 | 1 |
| 男 | 3 |
+------+------------+
group by + having
(1) having 条件表达式:用来分组查询后指定一些条件来输出查询结果
(2) having作用和where一样,但having只能用于group by
select sex,count(sex) from employee group by sex having count(sex)>2;
+------+------------+
| sex | count(sex) |
+------+------------+
| 男 | 3 |
+------+------------+
group by + with rollup
(1) with rollup的作用是:在最后新增一行,来记录当前列里所有记录的总和
select sex,count(age) from employee group by sex with rollup;
+------+------------+