如下SQL为保证所有数据都出自UP_LOAD_SERL10表(后面将以别名T3称呼此表,其他表也已别名称呼)
SELECT
T1.SERL_NO
,T1.SI_YMD
,T1.ST_YMD
,T1.ST2_YMD
,T1.SO_YMD
,T1.ACTU_DT FOTA_DT
,T2.SALE_YMD CHANEL_SO
,T2.REGI_DT CHANEL_SO_REGI_DT
FROM MCS_BI.RW_BI_FT_LT T1
,MCS_HQ.HI_SALE_CHNL T2
,MCS_HQ_READ.UP_LOAD_SERL10 T3
WHERE T1.SERL_NO = T2.SERL_NO(+)
AND T3.SERL_NO = T1.SERL_NO(+);
此SQL运行足有5分钟,T1和T2表数据非常大足都超过千万行,而T3数据很小只有60行,为实现需要得出的结果,三个表在一起必须使用外连接,执行计划如下
Plan hash value: 3150097070
----------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
----------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 63 |00:01:35.15 | 1173K| 1173K|
| 1 | NESTED LOOPS OUTER | | 1 | 63 | 63 |00:01:35.15 | 1173K| 1173K|
|* 2 | HASH JOIN OUTER | | 1 | 63 | 63 |00:01:34.96 | 1173K| 1173K|
| 3 | TABLE ACCESS FULL | UP_LOAD_SERL10 | 1 | 63 | 63 |00:00:00.01 | 7 | 0 |
| 4 | PARTITION RANGE ALL | | 1 | 38M| 38M|00:01:13.83 | 1173K| 1173K|
| 5 | TABLE ACCESS FULL | RW_BI_FT_LT | 122 | 38M| 38M|00:01:03.25 | 1173K| 1173K|
| 6 | TABLE ACCESS BY GLOBAL INDEX ROWID| HI_SALE_CHNL | 63 | 1 | 63 |00:00:00.19 | 191 | 24 |
|* 7 | INDEX UNIQUE SCAN | UX_HI_SALE_CHNL_1 | 63 | 1 | 63 |00:00:00.19 | 128 | 24 |
----------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T3"."SERL_NO"="T1"."SERL_NO")
7 - access("T1"."SERL_NO"="T2"."SERL_NO")
Note
-----
- dynamic sampling used for this statement (level=2)
由于是三个表做外连接,两个连接条件必须同时满足,且T3表又是最小的,优化器选择T1表和T3表先做HASH JOIN OUTER,然后与T1做NEST LOOP OUTER,从计划中可以看出T3和T1做外连接时,T1,T3表的SERL_NO上都没有索引,所以对应都是全表扫描,而T1和T2连接时,T2的SERL_NO上有索引,所以先做了INDEX UNIQUE SCAN,然后通过ROWID回表扫描。这里T1表的返回行数巨大,全表扫描必然很慢,因此应该想办法让T3和T1表连接时,T1的字段能走索引,查看T1表发现上面有主键索引 PK_INDEX 且索引包括两个字段 FOTA_SO_YMD, SERL_NO,因此优化器并没有使用到此索引,于是选择优化方式可以改写SQL让优化器便于查找表上的索引并能使用,对SQl采用如下改写
WITH T_MAST AS (
SELECT T1.SERL_NO
,T1.SI_YMD
,T1.ST_YMD
,T1.ST2_YMD
,T1.SO_YMD
,T1.ACTU_DT FOTA_DT
,T2.SALE_YMD CHANEL_SO
,T2.REGI_DT CHANEL_SO_REGI_DT
FROM MCS_BI.RW_BI_FT_LT T1
,MCS_HQ.HI_SALE_CHNL T2
WHERE T1.SERL_NO = T2.SERL_NO(+)
)
SELECT T_MAST.*
FROM T_MAST,
MCS_HQ_READ.UP_LOAD_SERL10 T2
WHERE T_MAST.SERL_NO(+) = T2.SERL_NO;
对应执行计划
Plan hash value: 2281879921
------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 63 |00:00:23.39 | 96374 | 2705 |
| 1 | NESTED LOOPS OUTER | | 1 | 63 | 63 |00:00:23.39 | 96374 | 2705 |
| 2 | TABLE ACCESS FULL | UP_LOAD_SERL10 | 1 | 63 | 63 |00:00:00.01 | 7 | 0 |
| 3 | VIEW PUSHED PREDICATE | | 63 | 1 | 63 |00:00:23.39 | 96367 | 2705 |
| 4 | NESTED LOOPS OUTER | | 63 | 1 | 63 |00:00:23.39 | 96367 | 2705 |
| 5 | PARTITION RANGE ALL | | 63 | 1 | 63 |00:00:23.17 | 96183 | 2679 |
| 6 | TABLE ACCESS BY LOCAL INDEX ROWID| RW_BI_FT_LT | 7686 | 1 | 63 |00:00:23.16 | 96183 | 2679 |
|* 7 | INDEX SKIP SCAN | PK_RW_BI_FT_LT | 7686 | 1 | 63 |00:00:22.93 | 96121 | 2632 |
| 8 | TABLE ACCESS BY GLOBAL INDEX ROWID| HI_SALE_CHNL | 63 | 1 | 63 |00:00:00.22 | 184 | 26 |
|* 9 | INDEX UNIQUE SCAN | UX_HI_SALE_CHNL_1 | 63 | 1 | 63 |00:00:00.22 | 121 | 26 |
------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
7 - access("T1"."SERL_NO"="T2"."SERL_NO")
filter("T1"."SERL_NO"="T2"."SERL_NO")
9 - access("T2"."SERL_NO"="T2"."SERL_NO")
filter("T1"."SERL_NO"="T2"."SERL_NO")
Note
-----
- dynamic sampling used for this statement (level=2)
这样原T1,T2关联时就可以使用字段中SERL_No的索引,避免了多个条件并列时优化器采用全表扫描的情况,于是优化后的SQL执行限制在10s内