标签
PostgreSQL , 索引 , 复合索引 , 选择 , 成本 , 优化器
背景
当一个表有很多索引时,并且一个QUERY可以使用到其中的多个索引时,数据库会如何做出选择?最终选择哪个,或者哪几个索引呢?
《PostgreSQL 多查询条件,多个索引的选择算法与问题诊断方法》
选择单个索引时,PATH可以选择index scan , index only scan, bitmap scan。
选择多个索引时,PATH可以选择bitmap scan。
那么优化器会选择哪个,或者哪几个索引,又会选择index scan , index only scan还是 bitmap scan呢?
例子
1、创建4张表,5个字段,分别使用int2,int4,int8,text类型。
create table t1 (c1 int2, c2 int2, c3 int2, c4 int2, c5 int2);
create table t2 (c1 int4, c2 int4, c3 int4, c4 int4, c5 int4);
create table t3 (c1 int8, c2 int8, c3 int8, c4 int8, c5 int8);
create table t4 (c1 text, c2 text, c3 text, c4 text, c5 text);
2、写入测试数据1000万条,其中c3字段为常量
insert into t1 select random()*32767,random()*32767,random()*32767,1,random()*32767 from generate_series(1,10000000);
insert into t2 select random()*2000000000,random()*2000000000,random()*2000000000,1,random()*2000000000 from generate_series(1,10000000);
insert into t3 select random()*2000000000,random()*2000000000,random()*2000000000,1,random()*2000000000 from generate_series(1,10000000);
insert into t4 select md5(random()::text),md5(random()::text),md5(random()::text),'a',md5(random()::text) from generate_series(1,10000000);
3、创建若干索引
create index idx_t1_1 on t1(c1);
create index idx_t1_2 on t1(c1,c2);
create index idx_t1_3 on t1(c1,c2,c3);
create index idx_t1_4 on t1(c4);
create index idx_t2_1 on t2(c1);
create index idx_t2_2 on t2(c1,c2);
create index idx_t2_3 on t2(c1,c2,c3);
create index idx_t2_4 on t2(c4);
create index idx_t3_1 on t3(c1);
create index idx_t3_2 on t3(c1,c2);
create index idx_t3_3 on t3(c1,c2,c3);
create index idx_t3_4 on t3(c4);
create index idx_t4_1 on t4(c1);
create index idx_t4_2 on t4(c1,c2);
create index idx_t4_3 on t4(c1,c2,c3);
create index idx_t4_4 on t4(c4);
4、强制收集统计信息
vacuum analyze t1;
vacuum analyze t2;
vacuum analyze t3;
vacuum analyze t4;
数据库如何选择索引? - 现象
1、输入c1,c2,c3三个字段的等值过滤,作为条件,t1,t2,t3,t4分别选择了哪个索引?
t1选择了c1,c2,c3的合并索引,没有任何多余的filter
postgres=# explain select * from t1 where c1=1 and c2=1 and c3=1;
QUERY PLAN
--------------------------------------------------------------------
Index Scan using idx_t1_3 on t1 (cost=0.43..2.66 rows=1 width=10)
Index Cond: ((c1 = 1) AND (c2 = 1) AND (c3 = 1))
(2 rows)
t2也选择了c1,c2,c3的合并索引,没有任何多余的filter
postgres=# explain select * from t2 where c1=1 and c2=1 and c3=1;
QUERY PLAN
--------------------------------------------------------------------
Index Scan using idx_t2_3 on t2 (cost=0.43..2.66 rows=1 width=20)
Index Cond: ((c1 = 1) AND (c2 = 1) AND (c3 = 1))
(2 rows)
t3选择了c1,c2的合并索引,有多余的c3的filter
postgres=# explain select * from t3 where c1=1 and c2=1 and c3=1;
QUERY PLAN
--------------------------------------------------------------------
Index Scan using idx_t3_2 on t3 (cost=0.43..2.66 rows=1 width=40)
Index Cond: ((c1 = 1) AND (c2 = 1))
Filter: (c3 = 1)
(3 rows)
t4选择了c1,c2的合并索引,有多余的c3的filter
postgres=# explain select * from t4 where c1='1' and c2='1' and c3='a';
QUERY PLAN
---------------------------------------------------------------------
Index Scan using idx_t4_2 on t4 (cost=0.56..2.78 rows=1 width=134)
Index Cond: ((c1 = '1'::text) AND (c2 = '1'::text))
Filter: (c3 = 'a'::text)
(3 rows)
t1,t2,t3,t4的数据记录数,数据分布,SQL、索引都一样,为什么会有以上的选择性差异呢?
数据库如何选择索引? - 分析
1 索引大小分析
postgres=# \di+ idx_t*
List of relations
Schema | Name | Type | Owner | Table | Size | Description
--------+----------+-------+----------+-------+---------+-------------
public | idx_t1_1 | index | postgres | t1 | 214 MB |
public | idx_t1_2 | index | postgres | t1 | 214 MB |
public | idx_t1_3 | index | postgres | t1 | 214 MB |
public | idx_t1_4 | index | postgres | t1 | 214 MB |
public | idx_t2_1 | index | postgres | t2 | 214 MB |
public | idx_t2_2 | index | postgres | t2 | 214 MB |
public | idx_t2_3 | index | postgres | t2 | 301 MB |
public | idx_t2_4 | index | postgres | t2 | 214 MB |
public | idx_t3_1 | index | postgres | t3 | 214 MB |
public | idx_t3_2 | index | postgres | t3 | 301 MB |
public | idx_t3_3 | index | postgres | t3 | 387 MB |
public | idx_t3_4 | index | postgres | t3 | 214 MB |
public | idx_t4_1 | index | postgres | t4 | 563 MB |
public | idx_t4_2 | index | postgres | t4 | 911 MB |
public | idx_t4_3 | index | postgres | t4 | 1266 MB |
public | idx_t4_4 | index | postgres | t4 | 214 MB |
(16 rows)
有没有觉得很奇怪呢?
t1表,从1个字段到3个字段的合并索引,索引大小都一样!!!!!!!而且都与t3表单个字段一样大。
t2表,一个字段与2个字段的索引大小一样大!!!!!2个字段的与T3表的1个字段一样大。3个字段的与T3表的2个字段一样大。
t3表,容量方面看起来很正常。
t4表,c4单个字段的索引,与T3表的单个字段的索引一样大。
看起来好像是这样的:
INDEX TUPLE占用8个字节的倍数。(可能是对齐的考虑)
2 索引内部结构分析
1、采用pageinspect插件,可以看到索引内部的内容
create extension pageinspect;
2、查看索引叶子节点的内容
postgres=# select * from bt_metap('idx_t1_1');
magic | version | root | level | fastroot | fastlevel
--------+---------+------+-------+----------+-----------
340322 | 2 | 290 | 2 | 290 | 2
(1 row)
postgres=# select * from bt_page_items('idx_t1_1',290) limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+----------+---------+-------+------+-------------------------
1 | (3,1) | 8 | f | f |
2 | (289,1) | 16 | f | f | 54 01 00 00 00 00 00 00
3 | (575,1) | 16 | f | f | a9 02 00 00 00 00 00 00
4 | (860,1) | 16 | f | f | fe 03 00 00 00 00 00 00
5 | (1145,1) | 16 | f | f | 52 05 00 00 00 00 00 00
(5 rows)
postgres=# select * from bt_page_items('idx_t1_1',289) limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+---------+---------+-------+------+-------------------------
1 | (572,1) | 16 | f | f | a9 02 00 00 00 00 00 00
2 | (286,1) | 8 | f | f |
3 | (287,1) | 16 | f | f | 55 01 00 00 00 00 00 00
4 | (288,1) | 16 | f | f | 56 01 00 00 00 00 00 00
5 | (291,1) | 16 | f | f | 57 01 00 00 00 00 00 00
(5 rows)
postgres=# select * from bt_page_items('idx_t1_1',287) limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+-------------+---------+-------+------+-------------------------
1 | (38155,112) | 16 | f | f | 56 01 00 00 00 00 00 00
2 | (20034,36) | 16 | f | f | 55 01 00 00 00 00 00 00
3 | (20183,9) | 16 | f | f | 55 01 00 00 00 00 00 00
4 | (20250,124) | 16 | f | f | 55 01 00 00 00 00 00 00
5 | (20957,123) | 16 | f | f | 55 01 00 00 00 00 00 00
(5 rows)
postgres=# select * from t1 where ctid='(38155,112)';
c1 | c2 | c3 | c4 | c5
-----+-------+-------+----+-------
342 | 18698 | 24394 | 1 | 27763
(1 row)
postgres=# select to_hex(342);
to_hex
--------
156
(1 row)
对应56 01 00 00 00 00 00 00
倒过来看 01 56。
确实有一个现象和前面的描述一样: INDEX TUPLE占用8个字节的倍数。
虽然是INT2的类型,本质上只占用2字节,但是实际上INDEX TUPLE占用了8个字节。
3、查看T2表,1个字段的索引。现象也和前面的描述一样: INDEX TUPLE占用8个字节的倍数。
postgres=# select * from bt_metap('idx_t2_1');
magic | version | root | level | fastroot | fastlevel
--------+---------+------+-------+----------+-----------
340322 | 2 | 290 | 2 | 290 | 2
(1 row)
postgres=# select * from bt_page_items('idx_t2_1',290) limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+----------+---------+-------+------+-------------------------
1 | (3,1) | 8 | f | f |
2 | (289,1) | 16 | f | f | 0f 5c 3c 01 00 00 00 00
3 | (575,1) | 16 | f | f | 8f 8e 7a 02 00 00 00 00
4 | (860,1) | 16 | f | f | c1 f8 b7 03 00 00 00 00
5 | (1145,1) | 16 | f | f | 46 12 f7 04 00 00 00 00
(5 rows)
postgres=# select * from bt_page_items('idx_t2_1',289) limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+---------+---------+-------+------+-------------------------
1 | (572,1) | 16 | f | f | 8f 8e 7a 02 00 00 00 00
2 | (286,1) | 8 | f | f |
3 | (287,1) | 16 | f | f | 5b 8a 3d 01 00 00 00 00
4 | (288,1) | 16 | f | f | 36 a2 3e 01 00 00 00 00
5 | (291,1) | 16 | f | f | 4b ca 3f 01 00 00 00 00
(5 rows)
postgres=# select * from bt_page_items('idx_t2_1',572) limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+------------+---------+-------+------+-------------------------
1 | (3798,12) | 16 | f | f | 3d b3 7b 02 00 00 00 00
2 | (48744,46) | 16 | f | f | 8f 8e 7a 02 00 00 00 00
3 | (40279,76) | 16 | f | f | 1a 8f 7a 02 00 00 00 00
4 | (53268,39) | 16 | f | f | a9 8f 7a 02 00 00 00 00
5 | (16540,92) | 16 | f | f | 35 90 7a 02 00 00 00 00
(5 rows)
postgres=# select * from t2 where ctid='(3798,12)';
c1 | c2 | c3 | c4 | c5
----------+-----------+----------+----+------------
41661245 | 940376658 | 41196565 | 1 | 1467443120
(1 row)
postgres=# select to_hex(41661245);
to_hex
---------
27bb33d
(1 row)
对应3d b3 7b 02 00 00 00 00
倒过来看 02 7b b3 3d。
4、查看T2表,3个字段的索引。现象也和前面的描述一样: INDEX TUPLE占用8个字节的倍数。
postgres=# select * from bt_metap('idx_t2_3');
magic | version | root | level | fastroot | fastlevel
--------+---------+------+-------+----------+-----------
340322 | 2 | 209 | 2 | 209 | 2
(1 row)
postgres=# select * from bt_page_items('idx_t2_3',209) limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+---------+---------+-------+------+-------------------------------------------------
1 | (3,1) | 8 | f | f |
2 | (208,1) | 24 | f | f | a8 2c a1 00 78 82 d4 3d 55 1b a6 67 00 00 00 00
3 | (413,1) | 24 | f | f | a8 9d 42 01 3c fe ab 6b e7 69 81 1c 00 00 00 00
4 | (617,1) | 24 | f | f | c8 f7 e5 01 eb cd b6 31 49 90 14 6f 00 00 00 00
5 | (821,1) | 24 | f | f | e7 b6 86 02 9c 1e 63 65 c4 c3 06 48 00 00 00 00
(5 rows)
postgres=# select * from bt_page_items('idx_t2_3',208) limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+---------+---------+-------+------+-------------------------------------------------
1 | (410,1) | 24 | f | f | a8 9d 42 01 3c fe ab 6b e7 69 81 1c 00 00 00 00
2 | (205,1) | 8 | f | f |
3 | (206,1) | 24 | f | f | b5 ef a1 00 bf b0 c5 14 94 4f ae 63 00 00 00 00
4 | (207,1) | 24 | f | f | 1b b1 a2 00 0a f5 c6 33 c4 09 e3 12 00 00 00 00
5 | (210,1) | 24 | f | f | 34 81 a3 00 82 74 a7 42 fe d6 33 06 00 00 00 00
(5 rows)
postgres=# select * from bt_page_items('idx_t2_3',410) limit 5;
itemoffset | ctid | itemlen | nulls | vars | data
------------+-------------+---------+-------+------+-------------------------------------------------
1 | (61769,131) | 24 | f | f | a6 6b 43 01 83 ab a4 04 0c a4 9e 29 00 00 00 00
2 | (44927,154) | 24 | f | f | a8 9d 42 01 3c fe ab 6b e7 69 81 1c 00 00 00 00
3 | (24587,3) | 24 | f | f | e3 9d 42 01 4a 9f 46 25 b9 d2 8e 47 00 00 00 00
4 | (29996,51) | 24 | f | f | be 9e 42 01 66 33 fd 03 8c 8c 39 2c 00 00 00 00
5 | (4032,73) | 24 | f | f | 5c 9f 42 01 ea 4d b1 01 1d af 88 32 00 00 00 00
(5 rows)
postgres=# select * from t2 where ctid='(61769,131)';
c1 | c2 | c3 | c4 | c5
----------+----------+-----------+----+------------
21195686 | 77900675 | 698262540 | 1 | 1522991839
(1 row)
postgres=# select to_hex(21195686);
to_hex
---------
1436ba6
(1 row)
postgres=# select to_hex(77900675);
to_hex
---------
4a4ab83
(1 row)
postgres=# select to_hex(698262540);
to_hex
----------
299ea40c
(1 row)
对应a6 6b 43 01 83 ab a4 04 0c a4 9e 29 00 00 00 00
每个字段占用4字节,倒过来看 01 43 6b a6。04 a4 ab 83。29 9e a4 0c。
但是这些个索引选择有什么关系吗?
本身没什么关系,但是由于这样出现了一个问题,索引本身占用的大小,在评估成本时也在成本计算环节之一。
HINT , 成本对比
最后数据库选什么索引,其实还是取决于COST。
每个PATH的COST都会算一遍,哪个最低选哪个。
使用hint来强制使用某个索引,这样就能对比出COST到底差多少?
postgres=# set pg_hint_plan.debug_print =on;
SET
postgres=# set pg_hint_plan.enable_hint=on;
SET
postgres=# set pg_hint_plan.message_level =log;
SET
postgres=# set client_min_messages =log;
SET
以t4表为例
postgres=# explain (analyze,verbose,timing,costs,buffers) /*+ IndexScan(t4 idx_t4_3) */ select * from t4 where c1='1' and c2='1' and c3='a';
LOG: available indexes for IndexScan(t4): idx_t4_3
LOG: pg_hint_plan:
used hint:
IndexScan(t4 idx_t4_3)
not used hint:
duplication hint:
error hint:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Index Scan using idx_t4_3 on public.t4 (cost=0.69..2.91 rows=1 width=134) (actual time=0.012..0.012 rows=0 loops=1)
Output: c1, c2, c3, c4, c5
Index Cond: ((t4.c1 = '1'::text) AND (t4.c2 = '1'::text) AND (t4.c3 = 'a'::text))
Buffers: shared hit=5
Planning time: 0.188 ms
Execution time: 0.027 ms
(6 rows)
postgres=# explain (analyze,verbose,timing,costs,buffers) /*+ IndexScan(t4 idx_t4_2) */ select * from t4 where c1='1' and c2='1' and c3='a';
LOG: available indexes for IndexScan(t4): idx_t4_2
LOG: pg_hint_plan:
used hint:
IndexScan(t4 idx_t4_2)
not used hint:
duplication hint:
error hint:
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Index Scan using idx_t4_2 on public.t4 (cost=0.56..2.78 rows=1 width=134) (actual time=0.012..0.012 rows=0 loops=1)
Output: c1, c2, c3, c4, c5
Index Cond: ((t4.c1 = '1'::text) AND (t4.c2 = '1'::text))
Filter: (t4.c3 = 'a'::text)
Buffers: shared hit=5
Planning time: 0.212 ms
Execution time: 0.027 ms
(7 rows)
很显然,数据库评估出来选择idx_t4_2的成本为2.78,使用idx_t4_3的成本为2.91。
理所当然,选择了idx_t4_2。虽然它多了一次filter。
那么为什么idx_t4_2的成本更低呢?
public | idx_t4_2 | index | postgres | t4 | 911 MB |
public | idx_t4_3 | index | postgres | t4 | 1266 MB |
1、走idx_t4_2索引时,评估出来扫描的数据块比idx_t4_3更少,虽然它多了filter,但是filter引入的开销在 RANDOM PAGE SCAN面前小得多(并且实际上这里没有filter的成本,因为c1,c2两个条件下去后评估出来的ROWS=0 select * from t4 where c1='1' and c2='1';
)。
2、走idx_t4_3索引时,评估出来不需要filter,但是这个索引比idx_t4_2占用的空间大,按比例划分时,算了更多的random page SCAN 成本。即使没有额外的filter,成本也比idx_t4_2更高。
PS:
这样的评估算法应该还有优化的空间,因为实际上两个索引的LEVEL是一样的。
postgres=# select * from bt_metap('idx_t2_2');
magic | version | root | level | fastroot | fastlevel
--------+---------+------+-------+----------+-----------
340322 | 2 | 290 | 2 | 290 | 2
(1 row)
postgres=# select * from bt_metap('idx_t2_3');
magic | version | root | level | fastroot | fastlevel
--------+---------+------+-------+----------+-----------
340322 | 2 | 209 | 2 | 209 | 2
(1 row)
以上,使用idx_t2_2和idx_t2_3扫描的INDEX PAGE数实际上是一样的。
小结
PostgreSQL 采用CBO的优化器模型,哪个PATH成本低,就使用哪个PATH。
在有多个索引可以使用时,实际上数据库会计算每一种的成本,哪种成本低,选择哪个。
影响成本的因素很多:
算子设置,选择性,扫描成本,INDEX TUPLE过滤成本,HEAP TUPLE过滤成本。
成本是综合的结果。
思考:
select * from tbl where c1=1 and c2=1 and c3=1 and c4=1;
idx(c1,c2)
idx(c2,c3)
idx(c4,c3,c1)
选哪个索引?需要考虑什么因素?
提示:
多列统计信息,统计信息,多列条件合并选择性、索引大小、RANDOM SCAN成本,SEQ SCAN成本,FILTER的成本,成本算子参数,优化器开关等。
参考
src/backend/optimizer/path/costsize.c
《PostgreSQL 多查询条件,多个索引的选择算法与问题诊断方法》
《PostgreSQL 自定义函数表达式选择性评估算法 - Statistics, Cardinality, Selectivity, Estimate》
《关键时刻HINT出彩 - PG优化器的参数优化、执行计划固化CASE》
《优化器成本因子校对 - PostgreSQL explain cost constants alignment to timestamp》
https://www.postgresql.org/docs/11/static/planner-stats-details.html