在SQL tuning中,不良写法导致SQL执行效率比比皆是。最近的SQL tuning中一个外部连接写法不当导致过SQL执行时间超过15分钟左右此即
为一例。通过修正该SQL的写法之后,过高的逻辑读呈数量级下降以及SQL语句执行时间也大幅下降。下面给出一个列子来演示该情形。
一、创建演示环境
-->当前数据库版本 SQL> select * from v$version where rownum<2; BANNER ---------------------------------------------------------------- Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - Prod -->创建演示表并插入记录 SQL> create table t as select empno,ename,job,sal,deptno 2 from emp where 1=2; SQL> insert into t select empno,ename,job,sal,deptno 2 from emp e where empno=(select max(empno) from emp where deptno=e.deptno); SQL> insert into t(empno,ename,job,sal) values(8888,'ROBINSON','DBA',2000); SQL> insert into t(empno,ename,job,sal) values(9999,'JACKSON','CLERK',2500); SQL> commit; SQL> analyze table t compute statistics; SQL> select * from t; EMPNO ENAME JOB SAL DEPTNO ---------- ---------- --------- ---------- ---------- 7934 MILLER CLERK 1300 10 7902 FORD ANALYST 3000 20 7900 JAMES CLERK 950 30 8888 ROBINSON DBA 2000 9999 JACKSON CLERK 2500 -->使用left join连接查看数据,此时表t中所有记录被返回 SQL> select empno,ename,sal,dname from t left join dept d on t.deptno=d.deptno; EMPNO ENAME SAL DNAME ---------- ---------- ---------- -------------- 7934 MILLER 1300 ACCOUNTING 7902 FORD 3000 RESEARCH 7900 JAMES 950 SALES 8888 ROBINSON 2000 9999 JACKSON 2500 -->下面同样是使用left join连接,但在on子句中增加了过滤条件t.sal>=2000 -->从下面的返回结果可知,t.sal>=2000子句并没有过滤掉sal小于2000的记录 SQL> select empno,ename,sal,dname from t left join dept d -->简称语句A 2 on(t.deptno=d.deptno and t.sal>=2000); EMPNO ENAME SAL DNAME ---------- ---------- ---------- -------------- 7934 MILLER 1300 7902 FORD 3000 RESEARCH 7900 JAMES 950 8888 ROBINSON 2000 9999 JACKSON 2500 -->使用left join连接,将过滤条件放到where 子句中 -->此时仅仅t.sal>=2000且符合t.deptno=d.deptno的记录被返回(结果与所期望一致) SQL> select empno,ename,sal,dname from t left join dept d -->简称语句B 2 on t.deptno=d.deptno where t.sal>=2000; EMPNO ENAME SAL DNAME ---------- ---------- ---------- -------------- 7902 FORD 3000 RESEARCH 8888 ROBINSON 2000 9999 JACKSON 2500 -->查看执行计划 SQL> set autotrace traceonly exp; -->语句A(过滤条件位于on 子句中的情形)的执行计划 SQL> select empno,ename,sal,dname from t left join dept d 2 (on t.deptno=d.deptno and t.sal>=2000); Execution Plan ---------------------------------------------------------- Plan hash value: 2195752858 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 120 | 6 (0)| 00:00:01 | | 1 | NESTED LOOPS OUTER | | 5 | 120 | 6 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL | T | 5 | 70 | 3 (0)| 00:00:01 | | 3 | VIEW | | 1 | 10 | 1 (0)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - access("T"."DEPTNO"="D"."DEPTNO") -->重点关注这里的谓词信息,两个过滤条件合在一起 filter("T"."DEPTNO" IS NOT NULL AND "T"."SAL">=2000) -->从执行计划来看位于第5步为INDEX UNIQUE SCAN Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 11 consistent gets -->此时的逻辑读为11 0 physical reads 0 redo size 696 bytes sent via SQL*Net to client 385 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 5 rows processed -->语句B(将谓词信息置于到where子句中的情形)的执行计划 SQL> select empno,ename,sal,dname from t left join dept d 2 on t.deptno=d.deptno where t.sal>=2000; Execution Plan ---------------------------------------------------------- Plan hash value: 832694258 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 81 | 4 (0)| 00:00:01 | | 1 | NESTED LOOPS OUTER | | 3 | 81 | 4 (0)| 00:00:01 | |* 2 | TABLE ACCESS FULL | T | 3 | 42 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("T"."SAL">=2000) -->此时的谓词信息分为两部分,"T"."SAL">=2000位于第二步 4 - access("T"."DEPTNO"="D"."DEPTNO"(+)) -->此条谓词信息用于实现表连接 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 10 consistent gets -->此时的逻辑读为10,由于2 - filter("T"."SAL">=2000)过滤后,内部循环少执行了一次 0 physical reads 0 redo size 658 bytes sent via SQL*Net to client 385 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 3 rows processed /**************************************************/ /* Author: Robinson Cheng */ /* Blog: http://blog.csdn.net/robinson_0612 */ /* MSN: robinson_0612@hotmail.com */ /* QQ: 645746311 */ /**************************************************/ -->从上面的观察中发现上述两条SQL语句执行计划并非最佳,存在改良的余地 -->由于是nested loops outer,因此考虑在表t的谓词列增加索引以快速过滤记录 SQL> create index i_t_sal on t(sal); SQL> exec dbms_stats.gather_table_stats('SCOTT','T',cascade=>true); -->增加索引后两个语句的执行情况 -->语句A的执行计划以及统计信息没有发生任何变化 SQL> select empno,ename,sal,dname from t left join dept d 2 on (t.deptno=d.deptno and t.sal>=2000); Execution Plan ---------------------------------------------------------- Plan hash value: 2195752858 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 140 | 6 (0)| 00:00:01 | | 1 | NESTED LOOPS OUTER | | 5 | 140 | 6 (0)| 00:00:01 | | 2 | TABLE ACCESS FULL | T | 5 | 90 | 3 (0)| 00:00:01 | | 3 | VIEW | | 1 | 10 | 1 (0)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 5 - access("T"."DEPTNO"="D"."DEPTNO") filter("T"."DEPTNO" IS NOT NULL AND "T"."SAL">=2000) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 11 consistent gets 0 physical reads 0 redo size 696 bytes sent via SQL*Net to client 385 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 5 rows processed -->语句B的执行计划发生变化,原来的全表扫描变为索引扫描 SQL> select empno,ename,sal,dname from t left join dept d 2 on t.deptno=d.deptno where t.sal>=2000; Execution Plan ---------------------------------------------------------- Plan hash value: 2452308905 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 93 | 3 (0)| 00:00:01 | | 1 | NESTED LOOPS OUTER | | 3 | 93 | 3 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID| T | 3 | 54 | 2 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | I_T_SAL | 3 | | 1 (0)| 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 | |* 5 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("T"."SAL">=2000) 5 - access("T"."DEPTNO"="D"."DEPTNO"(+)) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 6 consistent gets -->逻辑读也由10下降到6 0 physical reads 0 redo size 658 bytes sent via SQL*Net to client 385 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 3 rows processed
二、总结
1、尽可能避免SQL不良写法导致的不良后果
2、此例中由于将谓词信息放到ON子句中,在数据量庞大的表(百万行)连接中,则该写法导致过多的物理和逻辑I/O,使得中间结果集庞大
3、谓词信息放到ON子句中同时也导致索引失效
4、尽可能的在满足需求的情况下减小中间结果集
三、更多参考
dbms_xplan之display_cursor函数的使用