Oracle Reverse Key Indexes: How They Work and What They Are For (Reducing Index Hot Blocks)
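Oracle automatically creates an index on a table's primary key column, and by default this is an ordinary B-Tree index. When primary key values are added in order (ascending or descending), the default B-Tree index is not ideal: every new value sorts right next to the previous one, so inserts keep landing on the same edge of the index, and the number of levels in the index tree grows quickly as rows are inserted. The number of I/O operations needed to search an index is proportional to the number of levels in the tree. In other words, a B-Tree index with 5 levels may need up to 5 I/O operations before the index entry is finally read. Reducing the number of levels is therefore an important way to tune index performance.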
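If the indexed column's values are inserted in strict order, the B-Tree index grows on one side only and becomes an asymmetric, lopsided tree, as shown in Figure 5:
If the indexed column's values are inserted in random order, the result is an index tree that tends toward symmetry, as shown in Figure 6:
Comparing Figure 5 and Figure 6: reaching block A takes 5 I/O operations in Figure 5, but only 3 I/O operations in Figure 6.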
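Because the indexed values come from a sequence, their ordering cannot be avoided. When creating the index, however, Oracle lets you reverse the indexed values, that is, each key is reversed before it is stored. (Oracle reverses the bytes of each indexed column; the bit strings used here are a simplified illustration.) For example, 1000, 1001, 1011 and 1100 become 0001, 1001, 1101 and 0011 after reversal. Ordered data clearly becomes much more random once it is reversed, so the resulting index tree is more symmetric, which improves query performance on the table.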
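As a rough illustration of why reversal scatters sequential keys, here is a minimal Python sketch. It assumes keys are fixed-width 4-byte big-endian integers, which is a simplification and not Oracle's internal NUMBER format; reverse_key is a hypothetical helper name, not an Oracle function.

```python
def reverse_key(value: int, width: int = 4) -> bytes:
    """Return the bytes of `value` (big-endian, fixed width) in reverse order.

    Assumption: a fixed-width integer encoding stands in for Oracle's
    real column format, purely for illustration.
    """
    return value.to_bytes(width, "big")[::-1]


sequence_values = [254, 255, 256, 257]

# Show the stored (reversed) form of each sequential value.
for v in sequence_values:
    print(v, "->", reverse_key(v).hex())

# Index entries are ordered by the stored key, not by the original value.
order_in_index = sorted(sequence_values, key=reverse_key)
print("order inside the reverse key index:", order_in_index)
```

Values that differ only in their last byte differ in the first byte of the stored key, so neighbouring sequence values sort far apart once the index holds many entries.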
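1. When to use a reverse key index
1) When index leaf blocks are found to be hot blocks
Data access (typically bulk inserts) usually concentrates on one contiguous range of values. With a normal index, the leaf blocks covering that range easily become overheated, and in severe cases system performance degrades.
2) In a RAC environment
When several RAC nodes access data in a concentrated and intensive pattern, index hot blocks become very likely. If the system does not depend heavily on range scans, a reverse key index can be considered to improve performance. This is why the technique is seen mostly in RAC environments, where it can significantly reduce contention for index blocks.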
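The hot-block effect can be sketched with a toy model. The code below is a simulation under stated assumptions, not Oracle's B-Tree: the index is a single sorted list cut into fixed-size leaf blocks, block splits are ignored, and keys are 4-byte integers reversed bytewise. The names leaf_blocks_touched and reverse_key are hypothetical.

```python
import bisect


def reverse_key(value: int, width: int = 4) -> bytes:
    """Bytewise reversal of a fixed-width integer (illustrative encoding only)."""
    return value.to_bytes(width, "big")[::-1]


def leaf_blocks_touched(existing, new_values, stored_key, entries_per_block=100):
    """Count the distinct leaf blocks that would receive the new entries.

    The index is modelled as one sorted list of stored keys, split into
    blocks of `entries_per_block` entries. Real block splits are ignored.
    """
    keys = sorted(stored_key(v) for v in existing)
    touched = set()
    for v in new_values:
        position = bisect.bisect_left(keys, stored_key(v))
        touched.add(position // entries_per_block)
    return len(touched)


existing = range(100_000)               # rows already in the table
new_values = range(100_000, 100_200)    # next 200 sequence values

plain = leaf_blocks_touched(existing, new_values, lambda v: v)
reverse = leaf_blocks_touched(existing, new_values, reverse_key)

print("normal index     :", plain, "leaf block(s) receive the inserts")
print("reverse key index:", reverse, "leaf block(s) receive the inserts")
```

In this model every insert into the normal index targets the rightmost block, while the reversed keys are spread over many blocks, which is the behaviour that relieves contention when several sessions or RAC nodes insert at once.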
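2. Advantages of a reverse key index
The biggest advantage is reduced contention for index leaf blocks: fewer hot blocks and better system performance.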
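3. Disadvantages of a reverse key index
Because of the way a reverse key index stores its keys, it is unsuitable for systems that frequently read data with range scans (for example, "BETWEEN ... AND" or the comparison operators ">", "<", ">=", "<=" in the WHERE clause). Adjacent values are no longer stored next to each other, so the optimizer cannot use an index range scan and such queries often end up as full table scans, which lowers system performance instead of raising it. The reverse key index is used only for "=" predicates on the indexed column.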
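Why a range predicate cannot be served by one contiguous stretch of a reverse key index can be seen in a small Python sketch. It uses the same simplifying assumption as before (4-byte integers reversed bytewise, not Oracle's real encoding), and reverse_key and span are hypothetical helper names.

```python
def reverse_key(value: int, width: int = 4) -> bytes:
    """Bytewise reversal of a fixed-width integer (illustrative encoding only)."""
    return value.to_bytes(width, "big")[::-1]


ids = range(1, 1001)

# Position of every id inside each kind of index (entries sorted by stored key).
normal_pos = {v: i for i, v in enumerate(sorted(ids))}
reverse_pos = {v: i for i, v in enumerate(sorted(ids, key=reverse_key))}

wanted = range(101, 200)  # id > 100 AND id < 200, i.e. 99 values


def span(positions):
    """Number of consecutive index entries that must be walked to cover `wanted`."""
    hits = [positions[v] for v in wanted]
    return max(hits) - min(hits) + 1


print("entries wanted            :", len(wanted))
print("span in normal index      :", span(normal_pos))
print("span in reverse key index :", span(reverse_pos))
```

In the normal index the 99 wanted entries sit side by side. In the reversed order they are interleaved with unrelated keys, and in a real index the start and end of the range no longer delimit a usable slice at all.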
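Sometimes a SQL statement can be rewritten to avoid the range scan. For example, where id between 12345 and 12347 can be rewritten as where id in (12345, 12346, 12347). The CBO transforms such a query into where id=12345 or id=12346 or id=12347, which works with a reverse key index as well.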
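The rewrite can be generated mechanically for small integer ranges. The helper below is a hypothetical sketch (between_to_in_list is not part of any Oracle tool); it assumes an integer column and caps the list at 1000 values, since Oracle rejects longer IN lists with ORA-01795.

```python
def between_to_in_list(column: str, low: int, high: int, max_items: int = 1000) -> str:
    """Rewrite `column BETWEEN low AND high` as an IN list of integer literals.

    Only meaningful for integer-valued columns and short ranges.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    count = high - low + 1
    if count > max_items:
        raise ValueError(f"range expands to {count} values, more than {max_items}")
    values = ", ".join(str(v) for v in range(low, high + 1))
    return f"{column} IN ({values})"


predicate = between_to_in_list("id", 12345, 12347)
print("WHERE " + predicate)
```

For wide ranges or non-integer columns this rewrite does not apply, and a normal index (or a hash-partitioned one) is the more natural choice.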
4. A small experiment demonstrating how to create and alter a reverse key index
- SQL> select count(*) from t1;
- COUNT(*)
- ----------
- 0
- SQL> select count(*) from t2;
- COUNT(*)
- ----------
- 0
- SQL> select count(*) from t3;
- COUNT(*)
- ----------
- 2000000
- SQL> select INDEX_NAME,INDEX_TYPE,TABLE_NAME from user_indexes;
- INDEX_NAME INDEX_TYPE TABLE_NAME
- ------------------------------ --------------------------- ------------------------------
- PK_T2 NORMAL/REV T2
- PK_T1 NORMAL T1
- SQL> set timing on;
- SQL> set autotrace on;
- SQL> insert /* +append */ into t1 select * from t3;
- 2000000 rows created.
- Elapsed: 00:01:42.83
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 4161002650
- ---------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ---------------------------------------------------------------------------------
- | 0 | INSERT STATEMENT | | 2316K| 485M| 19014 (1)| 00:03:49 |
- | 1 | LOAD TABLE CONVENTIONAL | T1 | | | | |
- | 2 | TABLE ACCESS FULL | T3 | 2316K| 485M| 19014 (1)| 00:03:49 |
- ---------------------------------------------------------------------------------
- Note
- -----
- - dynamic sampling used for this statement (level=2)
- Statistics
- ----------------------------------------------------------
- 12305 recursive calls
- 538835 db block gets
- 203937 consistent gets
- 83057 physical reads
- 428323528 redo size
- 688 bytes sent via SQL*Net to client
- 614 bytes received via SQL*Net from client
- 3 SQL*Net roundtrips to/from client
- 2 sorts (memory)
- 0 sorts (disk)
- 2000000 rows processed
- SQL> commit;
- Commit complete.
- Elapsed: 00:00:00.04
- SQL> insert /* +append */ into t2 select * from t3;
- 2000000 rows created.
- Elapsed: 00:02:02.63
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 4161002650
- ---------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- ---------------------------------------------------------------------------------
- | 0 | INSERT STATEMENT | | 2316K| 485M| 19014 (1)| 00:03:49 |
- | 1 | LOAD TABLE CONVENTIONAL | T2 | | | | |
- | 2 | TABLE ACCESS FULL | T3 | 2316K| 485M| 19014 (1)| 00:03:49 |
- ---------------------------------------------------------------------------------
- Note
- -----
- - dynamic sampling used for this statement (level=2)
- Statistics
- ----------------------------------------------------------
- 7936 recursive calls
- 6059147 db block gets
- 158053 consistent gets
- 56613 physical reads
- 790167468 redo size
- 689 bytes sent via SQL*Net to client
- 614 bytes received via SQL*Net from client
- 3 SQL*Net roundtrips to/from client
- 2 sorts (memory)
- 0 sorts (disk)
- 2000000 rows processed
- SQL> commit;
- Commit complete.
- Elapsed: 00:00:00.01
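The statistics from the two inserts can be put side by side. Here is a small Python sketch of the arithmetic, using only the figures printed in the two autotrace outputs (t1 has the normal index, t2 the reverse key index); the seconds helper is just a convenience for this example.

```python
# Figures copied from the autotrace statistics of the two INSERT statements.
stats = {
    "db block gets": (538835, 6059147),
    "consistent gets": (203937, 158053),
    "physical reads": (83057, 56613),
    "redo size": (428323528, 790167468),
}


def seconds(minutes: int, secs: float) -> float:
    """Convert a minutes/seconds elapsed time to seconds."""
    return minutes * 60 + secs


elapsed = (seconds(1, 42.83), seconds(2, 2.63))

for name, (t1, t2) in stats.items():
    print(f"{name:16} t1={t1:>11} t2={t2:>11} diff={t2 - t1:>+11} ratio={t2 / t1:.2f}")
print(f"{'elapsed (s)':16} t1={elapsed[0]:>11.2f} t2={elapsed[1]:>11.2f} ratio={elapsed[1] / elapsed[0]:.2f}")
```

These are single-run numbers from one session, so they indicate direction rather than a precise cost.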
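The results show that, because the reverse key index scatters its entries across many blocks, db block gets are far higher for t2 (6059147 versus 538835, roughly 11 times). Hot-block contention is eased and consistent gets fall from 203937 to 158053, a reduction of 45884. Redo size also grows, from 428323528 to 790167468 bytes. Note that /* +append */ is not treated as a hint, because the + must immediately follow /*; both inserts therefore ran as conventional loads, which is why the plans show LOAD TABLE CONVENTIONAL. Next, let's run some queries and look at the differences.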
- SQL> set autotrace traceonly;
- SQL> select OBJECT_NAME from t1 where id = 100;
- Elapsed: 00:00:00.06
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 1141790563
- -------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 79 | 0 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 79 | 0 (0)| 00:00:01 |
- |* 2 | INDEX UNIQUE SCAN | PK_T1 | 1 | | 0 (0)| 00:00:01 |
- -------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("ID"=100)
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 4 consistent gets
- 3 physical reads
- 0 redo size
- 434 bytes sent via SQL*Net to client
- 416 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- SQL> select OBJECT_NAME from t1 where id > 100 and id < 200;
- 99 rows selected.
- Elapsed: 00:00:01.10
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 1249713949
- -------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 99 | 7821 | 1 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 99 | 7821 | 1 (0)| 00:00:01 |
- |* 2 | INDEX RANGE SCAN | PK_T1 | 99 | | 1 (0)| 00:00:01 |
- -------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("ID">100 AND "ID"<200)
- Note
- -----
- - dynamic sampling used for this statement (level=2)
- Statistics
- ----------------------------------------------------------
- 9 recursive calls
- 0 db block gets
- 140 consistent gets
- 189 physical reads
- 2356 redo size
- 2656 bytes sent via SQL*Net to client
- 482 bytes received via SQL*Net from client
- 8 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 99 rows processed
- SQL> select OBJECT_NAME from t2 where id = 100;
- Elapsed: 00:00:00.05
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 1480579010
- -------------------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- -------------------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 1 | 79 | 0 (0)| 00:00:01 |
- | 1 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 79 | 0 (0)| 00:00:01 |
- |* 2 | INDEX UNIQUE SCAN | PK_T2 | 1 | | 0 (0)| 00:00:01 |
- -------------------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 2 - access("ID"=100)
- Statistics
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 4 consistent gets
- 1 physical reads
- 0 redo size
- 434 bytes sent via SQL*Net to client
- 416 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
- SQL> select OBJECT_NAME from t2 where id > 100 and id < 200;
- 99 rows selected.
- Elapsed: 00:00:04.39
- Execution Plan
- ----------------------------------------------------------
- Plan hash value: 1513984157
- --------------------------------------------------------------------------
- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
- --------------------------------------------------------------------------
- | 0 | SELECT STATEMENT | | 336 | 26544 | 8282 (1)| 00:01:40 |
- |* 1 | TABLE ACCESS FULL| T2 | 336 | 26544 | 8282 (1)| 00:01:40 |
- --------------------------------------------------------------------------
- Predicate Information (identified by operation id):
- ---------------------------------------------------
- 1 - filter("ID">100 AND "ID"<200)
- Note
- -----
- - dynamic sampling used for this statement (level=2)
- Statistics
- ----------------------------------------------------------
- 29 recursive calls
- 1 db block gets
- 60187 consistent gets
- 30335 physical reads
- 5144 redo size
- 2656 bytes sent via SQL*Net to client
- 482 bytes received via SQL*Net from client
- 8 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 99 rows processed
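For a single-value lookup there is no difference between t1 and t2 (both use INDEX UNIQUE SCAN with 4 consistent gets). For the range query, however, t1 uses INDEX RANGE SCAN with 140 consistent gets, while t2 falls back to TABLE ACCESS FULL with 60187 consistent gets. In database tuning you will often find that nothing is absolutely good and nothing is absolutely bad.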
Before turning to a reverse key index, in most cases consider hash-partitioning the index first to reduce contention for index leaf blocks.
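How hash partitioning spreads sequential inserts can be sketched in Python. This is a toy model under loud assumptions: partition_of uses a plain modulo as a stand-in for Oracle's internal hash function, and 8 partitions is an arbitrary choice for the example.

```python
from collections import Counter


def partition_of(key: int, partitions: int = 8) -> int:
    """Stand-in for a hash function; Oracle's real hashing is internal and differs."""
    return key % partitions


new_keys = range(100_000, 100_200)  # 200 consecutive sequence values

# Count how many of the new entries each index partition receives.
inserts_per_partition = Counter(partition_of(k) for k in new_keys)
for p in sorted(inserts_per_partition):
    print(f"partition {p}: {inserts_per_partition[p]} inserts")

# Inside a single partition the keys keep their natural order.
in_partition_0 = [k for k in new_keys if partition_of(k) == 0]
print("partition 0 keys still ascending:", in_partition_0 == sorted(in_partition_0))
```

Each partition has its own right-hand edge, so concurrent inserts are spread over several blocks instead of one, while keys are not reversed and stay ordered within each partition.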
Rebuilding an existing index as a reverse key index, or back to a normal index:
alter index id_inx rebuild reverse online;
alter index id_inx rebuild online reverse;
alter index name_inx rebuild online noreverse;
Experiment code:
- CREATE TABLE xt_revi_lhr AS SELECT * FROM dba_objects;
- CREATE INDEX REV_INDEX_lhr ON xt_revi_lhr(object_id) REVERSE;
- CREATE INDEX REV_INDEX_lhr2 ON xt_revi_lhr(UPPER(object_type)) REVERSE;
- CREATE INDEX REV_INDEX_lhr3 ON xt_revi_lhr(object_name) REVERSE;
- SELECT * FROM DBA_INDEXES D WHERE D.INDEX_TYPE LIKE '%/REV';
-
- UPDATE xt_revi_lhr t SET t.object_name='a';
- UPDATE xt_revi_lhr t SET t.object_name='b' WHERE ROWNUM<=1;
-
- SELECT * FROM xt_revi_lhr t WHERE t.object_id=100;
- SELECT * FROM xt_revi_lhr t WHERE t.object_id BETWEEN 100 AND 101;
- SELECT /*+index(t REV_INDEX_lhr)*/ * FROM xt_revi_lhr t WHERE t.object_id BETWEEN 100 AND 101;
- SELECT * FROM xt_revi_lhr t WHERE t.object_id <5;
- SELECT * FROM xt_revi_lhr t WHERE t.object_id >100000000;
- SELECT * FROM xt_revi_lhr t WHERE t.object_id <=5;
- SELECT * FROM xt_revi_lhr t WHERE t.object_id >=100000000;
-
- SELECT * FROM xt_revi_lhr t WHERE t.object_name='b';
About Me
...............................................................................................................................
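● This article was compiled from online sources
● This article is also published on itpub (http://blog.itpub.net/26736162), cnblogs (http://www.cnblogs.com/lhrbest) and the author's WeChat public account (xiaomaimiaolhr)
● itpub address of this article: http://blog.itpub.net/26736162/abstract/1/
● cnblogs address of this article: http://www.cnblogs.com/lhrbest
● PDF version of this article and xiaomaimiao's cloud drive: http://blog.itpub.net/26736162/viewspace-1624453/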
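● Written in Shanghai between 2017-05-09 09:00 and 2017-05-30 22:00
● The content comes from xiaomaimiao's study notes, with parts compiled from online sources; please forgive any infringement or inaccuracy
● All rights reserved. You are welcome to share this article; please retain the source when reposting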
...............................................................................................................................