[20151217]12c标量子查询.txt
--我曾经写过blog,提到许多开发没有根据情况滥用子查询。
--而在12c下呢?
So starting with Oracle 12c, the CBO transformation engine can unnest some types of scalar subqueries and convert those
to Outer joins internally
--我喜欢通过例子来说明问题,看一些例子:
1.建立测试环境:
SCOTT@test01p> @ver1
PORT_STRING VERSION BANNER CON_ID
------------------------------ -------------- -------------------------------------------------------------------------------- ----------
IBMPC/WIN_NT64-9.1.0 12.1.0.1.0 Oracle Database 12c Enterprise Edition Release 12.1.0.1.0 - 64bit Production 0
SCOTT@test01p> create table dept2 as select * from dept;
Table created.
--建立dept2的目的,主要原始的dept上有索引,dept表太小,oracle趋向选择索引。
2.测试1:
SCOTT@test01p> alter session set statistics_level=all;
Session altered.
SCOTT@test01p> select emp.*,(select dname from dept2 where dept2.deptno=emp.deptno) dname from emp;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO DNAME
---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- --------------
7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20 RESEARCH
7499 ALLEN SALESMAN 7698 1981-02-20 00:00:00 1600 300 30 SALES
7521 WARD SALESMAN 7698 1981-02-22 00:00:00 1250 500 30 SALES
7566 JONES MANAGER 7839 1981-04-02 00:00:00 2975 20 RESEARCH
7654 MARTIN SALESMAN 7698 1981-09-28 00:00:00 1250 1400 30 SALES
7698 BLAKE MANAGER 7839 1981-05-01 00:00:00 2850 30 SALES
7782 CLARK MANAGER 7839 1981-06-09 00:00:00 2450 10 ACCOUNTING
7788 SCOTT ANALYST 7566 1987-04-19 00:00:00 3000 20 RESEARCH
7839 KING PRESIDENT 1981-11-17 00:00:00 5000 10 ACCOUNTING
7844 TURNER SALESMAN 7698 1981-09-08 00:00:00 1500 0 30 SALES
7876 ADAMS CLERK 7788 1987-05-23 00:00:00 1100 20 RESEARCH
7900 JAMES CLERK 7698 1981-12-03 00:00:00 950 30 SALES
7902 FORD ANALYST 7566 1981-12-03 00:00:00 3000 20 RESEARCH
7934 MILLER CLERK 7782 1982-01-23 00:00:00 1300 10 ACCOUNTING
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID afcj25qb7mbma, child number 1
-------------------------------------
select emp.*,(select dname from dept2 where dept2.deptno=emp.deptno)
dname from emp
Plan hash value: 1580873685
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 12 (100)| 14 |00:00:00.01 | 8 |
|* 1 | TABLE ACCESS FULL| DEPT2 | 3 | 1 | 3 (0)| 3 |00:00:00.01 | 6 |
| 2 | TABLE ACCESS FULL| EMP | 1 | 14 | 3 (0)| 14 |00:00:00.01 | 8 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DEPT2"."DEPTNO"=:B1)
--可以发现标量查询cost计算在内的,id=1的starts=3,3+3*3=12。而buffers并没有计算在内。最终buffers=8仅仅计算了扫描emp的buffers。
SCOTT@test01p> alter session set optimizer_features_enable='11.2.0.3';
Session altered.
SCOTT@test01p> select emp.*,(select dname from dept2 where dept2.deptno=emp.deptno) dname from emp;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO DNAME
---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- --------------
7369 SMITH CLERK 7902 1980-12-17 00:00:00 800 20 RESEARCH
....
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID afcj25qb7mbma, child number 2
-------------------------------------
select emp.*,(select dname from dept2 where dept2.deptno=emp.deptno)
dname from emp
Plan hash value: 1580873685
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 3 (100)| 14 |00:00:00.01 | 8 |
|* 1 | TABLE ACCESS FULL| DEPT2 | 3 | 1 | 3 (0)| 3 |00:00:00.01 | 9 |
| 2 | TABLE ACCESS FULL| EMP | 1 | 14 | 3 (0)| 14 |00:00:00.01 | 8 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DEPT2"."DEPTNO"=:B1)
--咋一看上去是一样的,如果仔细看id=1,buffers=9(前面是6),总的cost=3,没有计算了子查询的cost。
--会不会12c改进一些算法,缓存了查询的结果。
3.测试2:
--退出,取消一些参数的影响。
SCOTT@test01p> alter session set statistics_level=all;
Session altered.
SCOTT@test01p> select dept2.*,(select max(sal) from emp where emp.deptno=dept2.deptno) max_sal from dept2;
DEPTNO DNAME LOC MAX_SAL
---------- -------------- ------------- ----------
30 SALES CHICAGO 2850
20 RESEARCH DALLAS 3000
10 ACCOUNTING NEW YORK 5000
40 OPERATIONS BOSTON
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID gzuqru3da815n, child number 0
-------------------------------------
select dept2.*,(select max(sal) from emp where emp.deptno=dept2.deptno)
max_sal from dept2
Plan hash value: 443147980
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 6 (100)| 4 |00:00:00.01 | 10 | | | |
|* 1 | HASH JOIN OUTER | | 1 | 4 | 6 (0)| 4 |00:00:00.01 | 10 | 1321K| 1321K| 1041K (0)|
| 2 | TABLE ACCESS FULL | DEPT2 | 1 | 4 | 3 (0)| 4 |00:00:00.01 | 3 | | | |
| 3 | VIEW | VW_SSQ_1 | 1 | 3 | 3 (0)| 3 |00:00:00.01 | 7 | | | |
| 4 | HASH GROUP BY | | 1 | 3 | 3 (0)| 3 |00:00:00.01 | 7 | 1200K| 1200K| 1194K (0)|
| 5 | TABLE ACCESS FULL| EMP | 1 | 14 | 3 (0)| 14 |00:00:00.01 | 7 | | | |
-----------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("ITEM_1"="DEPT2"."DEPTNO")
--可以发现查询发生了转换,变成 HASH JOIN OUTER 。
SCOTT@test01p> alter session set optimizer_features_enable='11.2.0.3';
Session altered.
SCOTT@test01p> select dept2.*,(select max(sal) from emp where emp.deptno=dept2.deptno) max_sal from dept2;
DEPTNO DNAME LOC MAX_SAL
---------- -------------- ------------- ----------
10 ACCOUNTING NEW YORK 5000
20 RESEARCH DALLAS 3000
30 SALES CHICAGO 2850
40 OPERATIONS BOSTON
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID gzuqru3da815n, child number 0
-------------------------------------
select dept2.*,(select max(sal) from emp where emp.deptno=dept2.deptno)
max_sal from dept2
Plan hash value: 652948006
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 3 (100)| 4 |00:00:00.01 | 4 |
| 1 | SORT AGGREGATE | | 4 | 1 | | 4 |00:00:00.01 | 28 |
|* 2 | TABLE ACCESS FULL| EMP | 4 | 5 | 3 (0)| 14 |00:00:00.01 | 28 |
| 3 | TABLE ACCESS FULL | DEPT2 | 1 | 4 | 3 (0)| 4 |00:00:00.01 | 4 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EMP"."DEPTNO"=:B1)
--可以发现11g查询标量子查询成本还是蛮高的,要全表扫描emp4次。我个人不建议使用标量子查询的原因,导致逻辑读很大。
--上面的sql在11g下最好改写成:
select dept2.*,a.max_sal from dept2,(select deptno,max(sal) max_sal from emp group by deptno) a where a.deptno(+)=dept2.deptno;
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 4bf6ut1sp5yjk, child number 0
-------------------------------------
select dept2.*,a.max_sal from dept2,(select deptno,max(sal) max_sal
from emp group by deptno) a where a.deptno(+)=dept2.deptno
Plan hash value: 2488941779
--------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 8 (100)| 4 |00:00:00.01 | 10 | | | |
|* 1 | HASH JOIN OUTER | | 1 | 4 | 8 (25)| 4 |00:00:00.01 | 10 | 1321K| 1321K| 1039K (0)|
| 2 | TABLE ACCESS FULL | DEPT2 | 1 | 4 | 3 (0)| 4 |00:00:00.01 | 3 | | | |
| 3 | VIEW | | 1 | 3 | 4 (25)| 3 |00:00:00.01 | 7 | | | |
| 4 | HASH GROUP BY | | 1 | 3 | 4 (25)| 3 |00:00:00.01 | 7 | 1200K| 1200K| 1197K (0)|
| 5 | TABLE ACCESS FULL| EMP | 1 | 14 | 3 (0)| 14 |00:00:00.01 | 7 | | | |
--------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."DEPTNO"="DEPT2"."DEPTNO")
--可见12c在标量子查询上做了一些改进。但是我不明白为什么测试1不能转换,视乎是仅仅存在某种聚集函数时才会出现转换。
4.测试3:
--退出,取消一些参数的影响。
--看看12c增加了什么参数导致这种转变?
SYS@test01p> @hide scalar
old 10: and lower(a.ksppinm) like lower('%&1%')
new 10: and lower(a.ksppinm) like lower('%scalar%')
NAME DESCRIPTION DEFAULT_VALUE SESSION_VALUE SYSTEM_VALUE
---------------------------------------- ------------------------------------------------------------------ ---------------------- ---------------------- ---------------
_optimizer_unnest_scalar_sq enables unnesting of of scalar subquery TRUE TRUE TRUE
_scalar_type_lob_storage_threshold threshold for VARCHAR2, NVARCHAR2, and RAW storage as BLOB TRUE 4000 4000
--可以猜测是隐含参数_optimizer_unnest_scalar_sq导致这种变化。
SCOTT@test01p> alter session set statistics_level=all;
Session altered.
SCOTT@test01p> alter session set "_optimizer_unnest_scalar_sq"=false;
Session altered.
SCOTT@test01p> select dept2.*,(select max(sal) from emp where emp.deptno=dept2.deptno) max_sal from dept2;
DEPTNO DNAME LOC MAX_SAL
---------- -------------- ------------- ----------
10 ACCOUNTING NEW YORK 5000
20 RESEARCH DALLAS 3000
30 SALES CHICAGO 2850
40 OPERATIONS BOSTON
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID gzuqru3da815n, child number 1
-------------------------------------
select dept2.*,(select max(sal) from emp where emp.deptno=dept2.deptno)
max_sal from dept2
Plan hash value: 652948006
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 9 (100)| 4 |00:00:00.01 | 4 |
| 1 | SORT AGGREGATE | | 4 | 1 | | 4 |00:00:00.01 | 28 |
|* 2 | TABLE ACCESS FULL| EMP | 4 | 5 | 3 (0)| 14 |00:00:00.01 | 28 |
| 3 | TABLE ACCESS FULL | DEPT2 | 1 | 4 | 3 (0)| 4 |00:00:00.01 | 4 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EMP"."DEPTNO"=:B1)
--可以发现执行方式回到了11G。
5.测试4:
--退出,取消一些参数的影响。
--尝试使用hint参数回到11g的执行方式
SCOTT@test01p> alter session set statistics_level=all;
Session altered.
SCOTT@test01p> select /*+ no_unnest(@a) */ dept2.*,(select /*+ qb_name(a) */ max(sal) from emp where emp.deptno=dept2.deptno) max_sal from dept2;
DEPTNO DNAME LOC MAX_SAL
---------- -------------- ------------- ----------
10 ACCOUNTING NEW YORK 5000
20 RESEARCH DALLAS 3000
30 SALES CHICAGO 2850
40 OPERATIONS BOSTON
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID gjk5nkzu2wm67, child number 0
-------------------------------------
select /*+ no_unnest(@a) */ dept2.*,(select /*+ qb_name(a) */ max(sal)
from emp where emp.deptno=dept2.deptno) max_sal from dept2
Plan hash value: 652948006
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 9 (100)| 4 |00:00:00.01 | 4 |
| 1 | SORT AGGREGATE | | 4 | 1 | | 4 |00:00:00.01 | 28 |
|* 2 | TABLE ACCESS FULL| EMP | 4 | 5 | 3 (0)| 14 |00:00:00.01 | 28 |
| 3 | TABLE ACCESS FULL | DEPT2 | 1 | 4 | 3 (0)| 4 |00:00:00.01 | 4 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EMP"."DEPTNO"=:B1)
--非常讨厌这种提示,no_unnest 两次否定就是肯定,就是nest的意思。老外的思维模式就是不一样。
6.测试5:
--退出,取消一些参数的影响。
--max换成count呢?
SCOTT@test01p> select dept2.*,(select count(*) from emp where emp.deptno=dept2.deptno) max_sal from dept2;
DEPTNO DNAME LOC MAX_SAL
---------- -------------- ------------- ----------
10 ACCOUNTING NEW YORK 3
20 RESEARCH DALLAS 5
30 SALES CHICAGO 6
40 OPERATIONS BOSTON 0
SCOTT@test01p> @dpc '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID 2ad909b0xk6fv, child number 1
-------------------------------------
select dept2.*,(select count(*) from emp where emp.deptno=dept2.deptno)
max_sal from dept2
Plan hash value: 652948006
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | Cost (%CPU)| A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 9 (100)| 4 |00:00:00.01 | 4 |
| 1 | SORT AGGREGATE | | 4 | 1 | | 4 |00:00:00.01 | 28 |
|* 2 | TABLE ACCESS FULL| EMP | 4 | 5 | 3 (0)| 14 |00:00:00.01 | 28 |
| 3 | TABLE ACCESS FULL | DEPT2 | 1 | 4 | 3 (0)| 4 |00:00:00.01 | 4 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("EMP"."DEPTNO"=:B1)
--why? 实际上很简单,对比上面的输出就明白了。DEPTNO=40,max_sal的输出是null,而采用记数方式输出是0。
--在做标量子查询如果没有结果,会选择NULL来输出。而count如果没有记录符合,输出是0,这样oracle无法做
--查询转换。
7.看看10053的跟踪:
--有点乱,清除shared_pool。再执行相应sql语句。
SCOTT@test01p> alter system flush shared_pool;
System altered.
...
SYS@test01p> execute dbms_sqldiag.dump_trace(p_sql_id=>'&1',p_child_number=>&2,p_component=>'Compiler',p_file_id=>'DIAG');
Enter value for 1: gzuqru3da815n
Enter value for 2: 0
PL/SQL procedure successfully completed.
--查看跟踪文件:(做了格式化处理)
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "DEPT2"."DEPTNO" "DEPTNO","DEPT2"."DNAME" "DNAME","DEPT2"."LOC" "LOC","VW_SSQ_1"."MAX(SAL)" "MAX_SAL"
FROM (
SELECT MAX("EMP"."SAL") "MAX(SAL)","EMP"."DEPTNO" "ITEM_1"
FROM "SCOTT"."EMP" "EMP"
GROUP BY "EMP"."DEPTNO") "VW_SSQ_1","SCOTT"."DEPT2" "DEPT2"
WHERE "VW_SSQ_1"."ITEM_1"(+) = "DEPT2"."DEPTNO"
SYS@test01p> execute dbms_sqldiag.dump_trace(p_sql_id=>'&1',p_child_number=>&2,p_component=>'Compiler',p_file_id=>'DIAG');
Enter value for 1: 2ad909b0xk6fv
Enter value for 2: 0
PL/SQL procedure successfully completed.
********************
Subquery Unnest (SU)
********************
SU: bypassed: Scalar subquery has null-mutating select item.
SJC: Considering set-join conversion in query block SEL$1 (#0)
--估计这部通不过,无法实现转换。
Final query after transformations:******* UNPARSED QUERY IS *******
SELECT "DEPT2"."DEPTNO" "DEPTNO","DEPT2"."DNAME" "DNAME","DEPT2"."LOC" "LOC", (
SELECT COUNT(*) "COUNT(*)"
FROM "SCOTT"."EMP" "EMP"
WHERE "EMP"."DEPTNO" = "DEPT2"."DEPTNO"
) "MAX_SAL"
FROM "SCOTT"."DEPT2" "DEPT2"
--总结:
1.12c标量子查询可以实现查询转换,仅仅出现在一些聚集函数。
2.受_optimizer_unnest_scalar_sq参数的控制。
3.并不是所有的聚集函数都会出现,比如count。
4.但是不是聚集不会转换。
5.最后一点,我个人觉得要选择合适的场合,而不是不分场合的滥用。