Oracle高级查询-阿里云开发者社区

概述

高级查询在数据库的开发过程中应用广泛，本博文将从分组查询、多表查询和子查询三个方面介绍Oracle的高级查询，最后典型案例的应用。

sql窗口修改已执行的sql

ed表示编辑上个sql

/ 表示执行上个sql

分组查询

分组函数的概念

分组函数作用于一组数据，并对一组数据返回一个值。

分组函数的语法

常用分组函数

avg
sum
min
max
count
wm_contact 行转列

更多及用法请参考oracle函数

常见分组函数的使用

avg()/sum()

求出员工的平均工资和工资总和。

SQL> select avg(sal) , sum(sal) from emp ;
  AVG(SAL)   SUM(SAL)
---------- ----------
2073.21428      29025

min()/max()

SQL> select min(sal), max(sal) from emp;
  MIN(SAL)   MAX(SAL)
---------- ----------
       800       5000
SQL>

count()

SQL> select count(1) from emp;
  COUNT(1)
----------
        14

distinct 关键字

SQL> select distinct(deptno)  from emp;
DEPTNO
------
    30
    20
    10

wm_concat()行转列

SQL> select deptno   部门 ,wm_concat(ename) 部门总的员工 from emp  group by deptno;
  部门 部门总的员工
---- --------------------------------------------------------------------------------
  10 CLARK,MILLER,KING
  20 SMITH,FORD,ADAMS,SCOTT,JONES
  30 ALLEN,JAMES,TURNER,BLAKE,MARTIN,WARD

wm_concat不同版本的区别

10.2.0.4以前，wm_concat返回的是varchar2，10.2.0.5开始，是CLOB.

nvl()/nvl2()

分组函数会自动忽略空值, nvl()函数可以使分组函数不忽略空值

NVL (expr1, expr2)

【功能】若expr1为NULL，返回expr2；expr1不为NULL，返回expr1。

NVL2 (expr1, expr2, expr3)

【功能】expr1不为NULL，返回expr2；expr2为NULL，返回expr3。

expr2和expr3类型不同的话，expr3会转换为expr2的类型

group by

语法

在select列表中所有未包含在函数中的列都应该包含在group by子句中,否则会抛出 ora-00937 not a singel-group group function。

select  a, b, c ,avg(d)
from table_name 
group by a, b ,c ;

包含在group by子句中的列，不必包含在select列表中。

select avg(sal) 
from emp 
group by deptno;

使用多个列分组

按照部门、不同的职位，统计员工的工资总和。

select deptno, job, sum(sal) from emp group by deptno, job order by deptno;

先按照deptno分组，再按照job分组，如果都一样，则是同一组数据。

过滤分组-having子句的使用以及和where的区别

having子句语法

having子句和where的区别

where子句中不能使用组函数，having后可以使用；
当where和having可以通用的情况下，优先使用where，效率更高
where 先过滤后分组
having 先分组后过滤
优先使用where

举例：

在分组函数中使用order by

select deptno , avg(sal) 
from emp 
group by deptno
order by avg(sal) ;--按表达式排序

select deptno , avg(sal) 平均工资 
from emp 
group by deptno
order by 平均工资  ;--按别名排序

select deptno , avg(sal) 平均工资 
from emp 
group by deptno
order by 2  ; --按序号排序，表示第二列。 如果只有2列，不能出现比2大的值

分组函数的嵌套

栗子：求部门平均工资的最大值

先求出部门的平均工资
再求出平均工资中的最大值

select max(avg(sal)) from emp group by deptno;

包含在group by子句中的列，不必包含在select列表中。

group by语句的增强

分析一下这个报表

第一个红框内的是  按照部门和职位统计 工作总和
select deptno,job,sum(sal) from emp group by deptno,job order by deptno;
第二个小篮筐是 部门工资的总和
select deptno ,sum(sal) from  emp group by deptno order by deptno;
第三个总计是 工资总和 
select sum(sal) from emp ;

整合一下：

select  * from (
select deptno , job, sum(sal) from emp group by deptno ,  job
union all 
select deptno ,null ,sum(sal) from  emp group by deptno
union all 
select  null ,null ,sum(sal) from emp  )  order by deptno, job ;

我们可以通过oracle提供的rollup函数来简化书写的过程。

select deptno, job, sum(sal) from emp group by rollup(deptno, job);

可以得到同样的结果

理解rollup

rollup官方文档

select a, b, c, sum( d )
from t
group by rollup(a, b, c);

等效于

select * from (
select a, b, c, sum( d ) from t group by a, b, c 
union all
select a, b, null, sum( d ) from t group by a, b
union all
select a, null, null, sum( d ) from t group by a
union all
select null, null, null, sum( d ) from t
)

引申 GROUPING SETS 、 CUBE 、GROUPING

数据

create table students
(id number(15,0),
area varchar2(10),
stu_type varchar2(2),
score number(20,2));
insert into students values(1, '111', 'g', 80 );
insert into students values(1, '111', 'j', 80 );
insert into students values(1, '222', 'g', 89 );
insert into students values(1, '222', 'g', 68 );
insert into students values(2, '111', 'g', 80 );
insert into students values(2, '111', 'j', 70 );
insert into students values(2, '222', 'g', 60 );
insert into students values(2, '222', 'j', 65 );
insert into students values(3, '111', 'g', 75 );
insert into students values(3, '111', 'j', 58 );
insert into students values(3, '222', 'g', 58 );
insert into students values(3, '222', 'j', 90 );
insert into students values(4, '111', 'g', 89 );
insert into students values(4, '111', 'j', 90 );
insert into students values(4, '222', 'g', 90 );
insert into students values(4, '222', 'j', 89 );
commit;

GROUPING SETS

GROUPING SETS官方文档

select id,area,stu_type,sum(score) score 
from students
group by grouping sets((id,area,stu_type),(id,area),id)
order by id,area,stu_type;

理解grouping sets

select a, b, c, sum( d ) from t
group by grouping sets ( a, b, c )

等效于

select * from (
select a, null, null, sum( d ) from t group by a
union all
select null, b, null, sum( d ) from t group by b 
union all
select null, null, c, sum( d ) from t group by c 
)

CUBE

CUBE 官方文档

select id,area,stu_type,sum(score) score 
from students
group by cube(id,area,stu_type)
order by id,area,stu_type;

理解cube

select a, b, c, sum( d ) from t
group by cube( a, b, c)

等效于

select a, b, c, sum( d ) from t
group by grouping sets( 
( a, b, c ), 
( a, b ), ( a ), ( b, c ), 
( b ), ( a, c ), ( c ), 
() )

GROUPING

GROUPING官方文档

从上面的结果中我们很容易发现,每个统计数据所对应的行都会出现null。

如何来区分到底是根据那个字段做的汇总呢,grouping函数判断是否合计列!

select decode(grouping(id),1,'all id',id) id,
decode(grouping(area),1,'all area',to_char(area)) area,
decode(grouping(stu_type),1,'all_stu_type',stu_type) stu_type,
sum(score) score
from students
group by cube(id,area,stu_type)
order by id,area,stu_type;

多表连接

多表查询

笛卡尔积

列等于两个表列数的和
行等于两个表行数的乘积

笛卡尔积中并不是全部正确的数据，要根据连接条件进行筛选。

比如刚才的dept和emp, 满足连接条件emp.deptno=dept.deptno才是正确的数据。

在实际运行环境下，应避免使用笛卡儿积全集。

连接条件至少有n-1个。

ORACLE表的四种连接方式

等值连接

通过两个表具有相同意义的列，可以建立相等连接条件。

只有连接列上在两个表中都出现且值相等的行才会出现在查询结果中。

select a.empno, a.ename, a.sal, b.dname
  from emp a, dept b
 where a.deptno = b.deptno;

不等值连接

两个表中的相关的两列进行不等连接,

比较符号一般为>,<,…,between… and…(小值在前大值灾后)

select e.empno, e.ename, e.sal, s.grade
  from emp e, salgrade s
 where e.sal between s.losal and s.hisal;

外连接

对于外连接，Oracle中可以使用“(+)”来表示，还可以使用LEFT/RIGHT/FULL OUTER JOIN 。

外连接就是为了解决：通过外链接，把对于连接条件上不成立的记录，仍然包含在最后的结果中.

右外连接

A) 左条件(+) = 右条件;

　　代表除了显示匹配相等连接条件的信息之外,还显示右条件所在的表中无法匹配相等连接条件的信息。

此时也称为"右外连接".另一种表示方法是:

SELECT ... FROM 表1 RIGHT OUTER JOIN 表2 ON 连接条件

出现在表2中的字段，如果表1不存在该值，依然输出

左外连接

B) 左条件 = 右条件(+);

　　代表除了显示匹配相等连接条件的信息之外,还显示左条件所在的表中无法匹配相等连接条件的信息。

　　此时也称为"左外连接"

SELECT ... FROM 表1 LEFT OUTER JOIN 表2 ON 连接条件

存在表1的数据，如果表2不存在，依然输出

数据说明：

部门表有个id为40的部门。而员工表中却没有deptno=40的员工。

select   p.deptno  部门号, p.dname 部门名称, count(e.empno) 部门人数
from emp e  ,dept p
where e.deptno (+)= p.deptno --右外连接 ， 显示右边的表 不能匹配的信息   注意等号两侧表的顺序
group by   p.deptno , p.dname
order by p.deptno;
select   p.deptno  部门号, p.dname 部门名称, count(e.empno) 部门人数
from emp e  ,dept p
where  p.deptno = e.deptno (+) --左外连接 ， 显示左边的表 不能匹配的信息  注意等号两侧表的顺序
group by   p.deptno , p.dname
order by p.deptno;
select  p.deptno  部门号, p.dname 部门名称, count(e.empno) 部门人数
from emp e  right  join dept p  on e.deptno = p.deptno 
 group by   p.deptno , p.dname
order by p.deptno;
---不推荐这样写，因为dept不是主表。
select  p.deptno  部门号, p.dname 部门名称, count(e.empno) 部门人数
from dept p  left  join  emp e  on e.deptno = p.deptno 
 group by   p.deptno , p.dname
order by p.deptno;

自连接

自连接核心：通过别名，将同一张表视为多张表 ,多表做笛卡儿相等连接。

数据说明 mgr字段

select a.empno 员工工号,
       a.ename 员工姓名,
       a.mgr   领导工号,
       b.empno 领导工号,
       b.ename 领导姓名
  from emp a
  join emp b
    on a.mgr = b.empno
 order by a.empno;

自连接存在的问题和解决办法

问题：不适合大量数据的表

自连接不适合大量数据的表：因为查询同一个表看做多个表，他们的笛卡尔全集的记录数至少为行数的平方。如果看做3个表这是立方关系。假设emp有1亿条数据…做自连接的话可想而知

解决办法：层次查询connetct by

层次查询概述

oracle中的select语句可以用START WITH…CONNECT BY PRIOR子句实现递归查询，connect by 是结构化查询中用到的.

基本语法:

SELECT [LEVEL],column,expression, ...
FROM table
[WHERE conditions]
[[START WITH start_condition] [CONNECT BY PRIOR prior_condition]];

level是伪列，代表树的层级，根节点level为1，子节点为2等。
from后面只能是一个表或一个视图。
where条件可以限制查询返回的行，但不影响层次关系，属于将节点截断，但是这个被截断的节点的下层child不受影响.注意，彻底剪枝条件应放在connect by（connect by之后也可跟过滤条件，它将该条件节点后的所有子孙后代一并去除不显示，如：connect by prior employee_id=manager_id and employee_id>10），单点剪掉条件应放在where子句（入层后不输出）。因为connect by的优先级要高于where，也就是sql引擎先执行connect by。
start_condition定义层次化查询的起点，如employee_id=1。
prior_condition定义父行和子行之间的关系，如父子关系定义为employee_id=manager_id，表示父节点employee_id和子节点manager_id之间存在关系。如果不加PRIOR关键字则不会进行递归，只是相当于一个过滤条件，只能得到根节点。

另外，该关键字可放在前列前，也可放在后列前，放在哪列前哪列就是根节点

如果connect by prior中的prior被省略，则查询将不进行深层递归,只能得到根节点。

select
 level,--是oracle中的伪列，其实在emp表中并没有该字段
 empno, 
 ename, 
 sal, 
 mgr
  from emp
 start with mgr is null  /**只有根节点可以用 is null 这种写法 ，或者 empno=7839 也表示是从根节点开始 .当然也可以从任意节点开始遍历，获取特定子树 .比如遍历  JONES下面的所有子节点 empno=7566**/
 connect by prior empno = mgr
 order by 1;--按照第一个字段排序 即按照levle排序

connect by prior empno = mgr 等号左右两侧的字段顺序不要搞反了。表示父节点empno 和子节点mgr 之间存在关系。

connect by生成序列：

如生成一个1到10的序列：

SELECT ROWNUM FROM DUAL CONNECT BY ROWNUM <= 10;

其原理在于：省略start with则以所以点为根节点，而dual表只有一行所有只有一个节点，而connect by则对所有输入内容进行遍历。

上面的方法受制于rownum伪列的限制，想得到指定始尾的序列我们也可以借助level伪列，如：select level from dual where level >= 5 connect by level <= 10;

自连接和层次查询各有利弊，看使用场景。层次查询：不存在多表查询,但是查询结果没有自查询直观。

子查询

子查询概述

语法

分类

子查询分为单行子查询和多行子查询

select * from emp where sal>(select sal from emp where ename='SCOTT')

子查询需要注意的10个问题

不要忘记子查询语法中的小括号
形成良好的子查询的书写风格
可以使用子查询的位置：Where,select,having,from
不可以使用子查询的位置：group by
强调：from后面的子查询
主查询和子查询可以不是一张表
一般不在自查询中，使用排序；但是在Top-N分析问题中，必须对子查询排序
一般先执行子查询，再执行主查询；但相关子查询例外
单行子查询只能使用单行操作符；多行子查询只能多行操作符
注意：子查询中是Null值的问题

子查询语法中的小括号问题

子查询必须有小括号，否则会抛出ora-00936 :missing expression

子查询的书写风格问题

注意换行和缩进

可以使用子查询的位置

可以使用子查询的位置：Where,select,having,from

where:

select  * 
from emp 
where sal > (select sal 
                       from emp 
                       where ename = 'SCOTT');

select:

having:

select   deptno , avg(sal)
  from emp 
    group by deptno
      having avg(sal) > (select max(sal)
                   from emp
                     where deptno=30);

having不能换成where, 因为where后面不能使用分组函数。

from:

select * 
  from (
       select a.empno ,a.ename ,a.deptno 
         from emp a);

不可以使用子查询的位置

不可以在group by 后使用子查询

from后面的子查询

select *
from (select empno,ename,sal from emp);

select * 
from (select empno,ename,sal,12*sal 年薪 from emp);

主查询和子查询可以不是同一张表

select *
from emp 
where deptno=(select deptno
from dept
where dname='SALES');

当然也可以使用多表查询的方式：

select e.*
from emp e,dept d
where e.deptno=d.deptno and d.dname='SALES' ;

理论上应该尽量使用多表查询，因为上面的子查询有两个from语句，所以要对数据库访问查询两次，而下面的多表查询只访问了一次！这是理论上的结论，并没有考虑实际比如多表查询中产生的笛卡尔积的大小，具体情况还是要具体对待。

子查询的排序问题

select rownum,empno,ename,sal
from (select * from emp order by sal desc)
where rownum<=3;

rownum 行号，oracle提供的伪列。

将排序后的表作为一个集合放到from（）中生成一个新表

重新再查询rownum 就可以让rownum也实现排序了

行号需要注意的两个问题

1、行号永远按照默认的顺序生成
2、行号只能使用<，<=；不能使用>，>=

针对1的情况举个栗子：

我们按照工资排序下，在看下rownum的顺序

即使用order by排序，也不会打乱rownum默认生成行号的顺序。

针对2的情况举个栗子：

可以看到当使用rownum >号时，获取到的结果为空。

主查询和子查询的执行顺序

一般先执行子查询，再执行主查询；但相关子查询例外。

那什么是相关子查询呢？

单行子查询和多行子查询

单行子查询：插叙结果为一个

多行查询：查询结果为两个或两个以上

单行子查询可以使用单行操作符（也可以使用in啊）。

多行子查询只能使用多行操作符。

单行操作符

多行子查询

多行操作符

单行子查询栗子

select * from emp e 
       where e.job = (select job from emp b where b.empno=7566)
            and 
            e.sal > (select sal from emp c where c.empno=7782)

select deptno, min(sal)
  from emp
 group by deptno
having min(sal) > (select min(sal) from emp b where b.deptno = 20)

非法使用单行操作符

多行子查询栗子

多行操作符 in

多行操作符 any

找出员工中，只要比部门号为30的员工中的任何一个员工的工资高的员工信息。也就是说只要比部门号为30的员工中的那个工资最少的员工的工资高就满足条件。

any取的是集合的最小值。

select *
  from emp
 where sal > any (select sal from emp b where b.deptno = 30);

或者

单行操作符表示

select *
  from emp
 where sal > any (select sal from emp b where b.deptno = 30);

多行操作符 all

max取的是集合的最大值。

select *
  from emp
 where sal >  all (select sal from emp b where b.deptno = 30);

select *
  from emp
 where sal >  (select max(sal) from emp b where b.deptno = 30);

子查询中的空值问题 null

单行子查询的null问题

多行子查询的null问题

先看下emp的数据

in相当于 =ANY

not in 相当于 <>ALL（其中如果子查询返回值有NULL，则<>NULL当然没有结果）

ORACLE官方文档：

select * from emp where empno not in (select mgr from emp where mgr is not null);

案例

案例1

select  rn , empno, ename, sal
  from (select rownum rn, empno, ename, sal
          from (select empno, ename, sal from emp order by sal desc) t1
         where rownum <= 8)  t2
 where t2.rn >= 5;

案例2

最开始用的相关子查询做的，

select empno,
       ename,
       sal,
       (select avg(sal) from emp where deptno = e.deptno) avgsal
  from emp e
 where sal > (select avg(sal) from emp where deptno = e.deptno);

现在用多表查询的方式实现下

select e.empno, e.ename, e.sal, s.deptno, s.avgsal
  from emp e, (select deptno, avg(sal) avgsal from emp group by deptno) s
 where e.deptno = s.deptno
   and e.sal > s.avgsal

在pl/sql中，选中sql,按F5查看执行计划

可以看到相关子查询的效果更好一些。

案例3

统计员工的入职年份

使用函数方式

select count(*) Total,
        sum(decode(to_char(hiredate, 'YYYY'), '1980', '1', '0')) "1980",
        sum(decode(to_char(hiredate, 'YYYY'), '1981', '1', '0')) "1981",
        sum(decode(to_char(hiredate, 'YYYY'), '1982', '1', '0')) "1982",
        sum(decode(to_char(hiredate, 'YYYY'), '1987', '1', '0')) "1987"
   from emp;

使用子查询和dual伪表

select (select count(*) from emp) Total,
        (select count(*) from emp where to_char(hiredate, 'YYYY') = '1980') "1980",
        (select count(*) from emp where to_char(hiredate, 'YYYY') = '1981') "1981",
        (select count(*) from emp where to_char(hiredate, 'YYYY') = '1982') "1982",
        (select count(*) from emp where to_char(hiredate, 'YYYY') = '1987') "1987"
   from dual;

探索云世界

热门

云计算

大数据

云原生

人工智能

数据库

开发与运维

活动广场

任务中心

训练营

直播

乘风者计划

下载

镜像站

技术资料

Oracle高级查询

概述