PostgreSQL 设计优化case - 大宽表任意字段组合查询索引如何选择(btree, gin, rum) - (含单个索引列数超过32列的方法)

本文涉及的产品
RDS PostgreSQL Serverless,0.5-4RCU 50GB 3个月
推荐场景:
对影评进行热评分析
云原生数据库 PolarDB 分布式版,标准版 2核8GB
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
简介: 标签PostgreSQL , adhoc查询 , 大宽表 , 任意字段组合查询 , 索引 , btree , gin , rum背景大宽表,任意字段组合查询,透视。是实时分析系统中的常见需求:1、实时写入。

标签

PostgreSQL , adhoc查询 , 大宽表 , 任意字段组合查询 , 索引 , btree , gin , rum


背景

大宽表,任意字段组合查询,透视。是实时分析系统中的常见需求:

1、实时写入。

2、实时任意字段组合查询、透视。

PostgreSQL中,有多种方法支持这种场景:

《PostgreSQL 9种索引的原理和应用场景》

1、方法1,每个字段加一个索引(普通类型用btree, 多值类型(数组,json,全文检索,枚举等类型)用gin, 空间、范围类型用gist,时序类型用brin)。

2、方法2,使用单个复合索引,把所有字段放到GIN倒排索引接口中。

3、方法3,使用单个复合索引,把所有字段放到RUM索引接口中。

那么这三种方法,应该选哪种更好呢?

或者说选哪种,可以在写入、查询性能上获得更好的平衡。

让单个复合索引支持超过32个字段

注意,PG默认情况下,仅支持一个索引中,最多放32个字段。

通过修改 src/include/pg_config_manual.h ,重新编译,可以支持更多的字段。但是注意,支持的字段越多,索引的头信息会越大。

vi src/include/pg_config_manual.h  
  
/*  
 * Maximum number of columns in an index.  There is little point in making  
 * this anything but a multiple of 32, because the main cost is associated  
 * with index tuple header size (see access/itup.h).  
 *  
 * Changing this requires an initdb.  
 */  
// #define INDEX_MAX_KEYS   32  
#define INDEX_MAX_KEYS   128  
src/include/access/itup.h  
  
/*  
 * Index tuple header structure  
 *  
 * All index tuples start with IndexTupleData.  If the HasNulls bit is set,  
 * this is followed by an IndexAttributeBitMapData.  The index attribute  
 * values follow, beginning at a MAXALIGN boundary.  
 *  
 * Note that the space allocated for the bitmap does not vary with the number  
 * of attributes; that is because we don't have room to store the number of  
 * attributes in the header.  Given the MAXALIGN constraint there's no space  
 * savings to be had anyway, for usual values of INDEX_MAX_KEYS.  
 */  

如果字段允许NULL,则有一个数据结构IndexAttributeBitMapData用来表示哪个字段的值是NULL(与TUPLE的头信息类似,也有NULL BITMAP的表示)。

因此如果字段有NULL,则索引条目的的头信息中,会多出若干BIT,是固定大小的。

typedef struct IndexAttributeBitMapData  
{  
        bits8           bits[(INDEX_MAX_KEYS + 8 - 1) / 8];  
}   

INDEX_MAX_KEYS = 32 时,4字节。

对比每列独立索引、GIN单个全字段复合索引、RUM单个全字段复合索引。

安装测试PostgreSQL软件

1、编译软件

wget https://ftp.postgresql.org/pub/snapshot/dev/postgresql-snapshot.tar.bz2  
tar -jxvf postgresql-snapshot.tar.bz2  
cd postgresql-12devel  
  
参考上面的章节修改源码 src/include/pg_config_manual.h ,支持单个索引放128个字段。  
  
export USE_NAMED_POSIX_SEMAPHORES=1  
LIBS=-lpthread CFLAGS="-O3" ./configure --prefix=/home/digoal/pgsql12  
LIBS=-lpthread CFLAGS="-O3" make world -j 128  
LIBS=-lpthread CFLAGS="-O3" make install-world  

安装rum插件,略。

https://github.com/postgrespro/rum

2、设置环境变量

vi env12.sh  
export PS1="$USER@`/bin/hostname -s`-> "    
export PGPORT=8000  
export PGDATA=/data01/pg/pg_root$PGPORT    
export LANG=en_US.utf8    
export PGHOME=/home/digoal/pgsql12  
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib:$LD_LIBRARY_PATH    
export DATE=`date +"%Y%m%d%H%M"`    
export PATH=$PGHOME/bin:$PATH:.    
export MANPATH=$PGHOME/share/man:$MANPATH    
export PGHOST=$PGDATA    
export PGUSER=postgres    
export PGDATABASE=postgres    
alias rm='rm -i'    
alias ll='ls -lh'    
unalias vi  
  
  
. ./env12.sh  

3、初始化数据库集群

initdb -D $PGDATA -U postgres -X /data02/pg/pg_wal_8000 -E SQL_ASCII --locale=C  

4、修改数据库配置

vi $PGDATA/postgresql.conf  
  
listen_addresses = '0.0.0.0'  
port = 8000  
max_connections = 1000  
unix_socket_directories = '/tmp,.'  
shared_buffers = 32GB  
maintenance_work_mem = 1GB  
dynamic_shared_memory_type = posix  
vacuum_cost_delay = 0  
bgwriter_delay = 10ms  
bgwriter_lru_maxpages = 1000  
bgwriter_lru_multiplier = 10.0  
effective_io_concurrency = 0  
max_worker_processes = 128  
wal_level = replica  
synchronous_commit = off  
wal_buffers = 16MB  
wal_writer_delay = 10ms  
checkpoint_timeout = 35min  
max_wal_size = 64GB  
min_wal_size = 16GB  
checkpoint_completion_target = 0.1  
random_page_cost = 1.1  
log_destination = 'csvlog'  
logging_collector = on  
log_truncate_on_rotation = on  
log_checkpoints = on  
log_connections = on  
log_disconnections = on  
log_error_verbosity = verbose    
log_timezone = 'PRC'  
log_autovacuum_min_duration = 0    
autovacuum_freeze_max_age = 1200000000  
autovacuum_multixact_freeze_max_age = 1400000000  
autovacuum_vacuum_cost_delay = 0ms  
vacuum_freeze_table_age = 1150000000  
vacuum_multixact_freeze_table_age = 1150000000  
datestyle = 'iso, mdy'  
timezone = 'PRC'  
lc_messages = 'C'  
lc_monetary = 'C'  
lc_numeric = 'C'  
lc_time = 'C'  
default_text_search_config = 'pg_catalog.english'  
jit = off  

5、启动数据库

pg_ctl start  

构建测试表

1、分别创建a b c 3张表,结构一样,都是128个字段。按要求创建索引

每列一个BTREE索引

do language plpgsql $$   
declare  
  sql text := 'create table a (';  
begin  
  for i in 1..128 loop  
    sql := sql || 'c' || i || ' int default random()*2000000000,';  
  end loop;  
  sql := rtrim(sql,',');  
  sql := sql || ')';  
  execute sql;  
  for i in 1..128 loop  
    execute 'create index idx_a_'||i||' on a (c'||i||')';  
  end loop;  
end;  
$$;  

所有列放到一个GIN索引中

(注意GIN索引有FASTUPDATE参数,开启的时候,写入性能很好,但是会延迟合并PENDING TUPLES,写入快了,查询就可能慢(如果在PENDING TUPLES合并慢,并且很大时,越慢),后面会有例子)

create extension btree_gin;  
  
do language plpgsql $$   
declare  
  sql text := 'create table b (';  
begin  
  for i in 1..128 loop  
    sql := sql || 'c' || i || ' int default random()*2000000000,';  
  end loop;  
  sql := rtrim(sql,',');  
  sql := sql || ')';  
  execute sql;  
    
  sql := 'create index idx_b_1 on b using gin (';  
  for i in 1..128 loop  
    sql := sql || 'c' || i || ',';  
  end loop;  
  sql := rtrim(sql,',');  
  sql := sql || ')';  
  execute sql;  
end;  
$$;  

所有列放到一个RUM索引中

create extension rum;  
  
do language plpgsql $$   
declare  
  sql text := 'create table c (';  
begin  
  for i in 1..128 loop  
    sql := sql || 'c' || i || ' int default random()*2000000000,';  
  end loop;  
  sql := rtrim(sql,',');  
  sql := sql || ')';  
  execute sql;  
    
  sql := 'create index idx_c_1 on c using rum (';  
  for i in 1..128 loop  
    sql := sql || 'c' || i || ',';  
  end loop;  
  sql := rtrim(sql,',');  
  sql := sql || ')';  
  execute sql;  
end;  
$$;  

性能测试

1、创建三个性能测试脚本,分别用于测试三张表的写入性能

vi test_btree.sql  
\set c1 random(1,2000000000)  
insert into a (c1) values (:c1);  
  
  
vi test_gin.sql  
\set c1 random(1,2000000000)  
insert into b (c1) values (:c1);  
  
  
vi test_rum.sql  
\set c1 random(1,2000000000)  
insert into c (c1) values (:c1);  

2、独立128个btree索引的写入性能。

pgbench -M prepared -n -r -P 1 -f ./test_btree.sql -c 28 -j 28 -T 120  
  
transaction type: ./test_btree.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 28  
number of threads: 28  
duration: 120 s  
number of transactions actually processed: 1906387  
latency average = 1.762 ms  
latency stddev = 0.693 ms  
tps = 15886.268820 (including connections establishing)  
tps = 15887.521828 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set c1 random(1,2000000000)  
         1.761  insert into a (c1) values (:c1);  

1.59万行/s

Samples: 1M of event 'cpu-clock', Event count (approx.): 216077120900   
Overhead  Shared Object        Symbol                                   
  12.07%  postgres             [.] _bt_compare                          
   9.85%  postgres             [.] hash_search_with_hash_value          
   6.44%  [kernel]             [k] _raw_spin_unlock_irqrestore          
   5.33%  postgres             [.] LWLockAcquire                        
   4.17%  [kernel]             [k] __do_softirq                         
   3.90%  [kernel]             [k] run_timer_softirq                    
   3.50%  postgres             [.] PinBuffer                            
   3.20%  postgres             [.] _bt_relandgetbuf                     
   2.24%  libc-2.17.so         [.] __memset_sse2                        
   2.14%  postgres             [.] _bt_moveright                        
   2.07%  postgres             [.] LWLockRelease                        
   1.88%  libc-2.17.so         [.] __memmove_ssse3_back                 
   1.59%  [kernel]             [k] finish_task_switch                   
   1.25%  postgres             [.] LWLockReleaseClearVar                
   1.06%  postgres             [.] MarkBufferDirty                      
   0.99%  libc-2.17.so         [.] __memcpy_ssse3_back   

3、一个GIN复合索引(开启FASTUPDATE)的写入性能

pgbench -M prepared -n -r -P 1 -f ./test_gin.sql -c 28 -j 28 -T 120  
  
transaction type: ./test_gin.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 28  
number of threads: 28  
duration: 120 s  
number of transactions actually processed: 4791575  
latency average = 0.701 ms  
latency stddev = 32.171 ms  
tps = 39929.190873 (including connections establishing)  
tps = 39932.416884 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.001  \set c1 random(1,2000000000)  
         0.700  insert into b (c1) values (:c1);  

3.99万行/s

Samples: 698K of event 'cpu-clock', Event count (approx.): 119873063495   
Overhead  Shared Object        Symbol                                     
   9.70%  libc-2.17.so         [.] __memset_sse2                          
   7.79%  [kernel]             [k] _raw_spin_unlock_irqrestore            
   3.93%  postgres             [.] ExecInitExprRec                        
   3.13%  postgres             [.] AllocSetAlloc                          
   2.69%  postgres             [.] ExecInitFunc                           
   2.12%  postgres             [.] TupleDescInitEntry                     
   2.04%  [kernel]             [k] finish_task_switch                     
   1.64%  postgres             [.] hash_search_with_hash_value            
   1.57%  postgres             [.] SearchCatCache1                        
   1.56%  libc-2.17.so         [.] bsearch                                
   1.54%  libc-2.17.so         [.] __memcpy_ssse3_back                    
   1.40%  postgres             [.] expression_tree_walker                 
   1.33%  postgres             [.] palloc0                                
   1.29%  [kernel]             [k] run_timer_softirq                      
   1.27%  postgres             [.] dispatch_compare_ptr                   
   1.25%  postgres             [.] fmgr_info                              
   0.96%  postgres             [.] LWLockAcquire                   

4、一个RUM复合索引的写入性能

pgbench -M prepared -n -r -P 1 -f ./test_rum.sql -c 28 -j 28 -T 120  
  
  
transaction type: ./test_rum.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 28  
number of threads: 28  
duration: 120 s  
number of transactions actually processed: 453539  
latency average = 7.408 ms  
latency stddev = 11.713 ms  
tps = 3779.393984 (including connections establishing)  
tps = 3779.643084 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set c1 random(1,2000000000)  
         7.406  insert into c (c1) values (:c1);  

0.378万行/s

Samples: 260K of event 'cpu-clock', Event count (approx.): 56451610586   
Overhead  Shared Object       Symbol                                     
  18.58%  postgres            [.] computeRegionDelta                     
   5.74%  postgres            [.] GenericXLogRegisterBuffer              
   5.68%  [kernel]            [k] _raw_spin_unlock_irqrestore            
   5.25%  rum.so              [.] rumtuple_get_attrnum                   
   4.11%  [kernel]            [k] __do_softirq                           
   3.51%  postgres            [.] hash_search_with_hash_value            
   3.44%  rum.so              [.] rumFindLeafPage                        
   3.38%  libc-2.17.so        [.] __memcpy_ssse3_back                    
   3.10%  libc-2.17.so        [.] __memset_sse2                          
   2.77%  [kernel]            [k] run_timer_softirq                      
   2.70%  postgres            [.] LWLockAcquire                          
   2.07%  postgres            [.] PinBuffer                              
   1.71%  rum.so              [.] entryLocateLeafEntry                   
   1.49%  postgres            [.] LWLockReleaseClearVar                  
   1.36%  [kernel]            [k] copy_user_enhanced_fast_string         
   1.07%  postgres            [.] LWLockRelease                          
   0.97%  rum.so              [.] entryLocateEntry                       
   0.89%  postgres            [.] UnpinBuffer.constprop.6  

写入性能小结

很显然,性能最好的是GIN(开启FASTUPDATE)时的性能,其次是独立索引,最后是RUM索引。

查询性能

1、都测试了120秒,写入量如下

postgres=# \dt+  
                    List of relations  
 Schema | Name | Type  |  Owner   |  Size   | Description   
--------+------+-------+----------+---------+-------------  
 public | a    | table | postgres | 993 MB  |   
 public | b    | table | postgres | 2497 MB |   
 public | c    | table | postgres | 237 MB  |   
(2 rows)  
  
  
  
postgres=# select count(*) from a;  
  count    
---------  
 1906387  
(1 row)  
  
postgres=# select count(*) from b;  
  count    
---------  
 4791575  
(1 row)  
  
  
postgres=# select count(*) from c;  
 count    
--------  
 453539  
(1 row)  

2、索引大小,很显然GIN索引很大,超过了所有BTREE索引加起来的大小,并且还没有完全合并所有的pending tuples。

postgres=# \di+  
                          List of relations  
 Schema |   Name    | Type  |  Owner   | Table | Size  | Description   
--------+-----------+-------+----------+-------+-------+-------------  
 public | idx_a_1   | index | postgres | a     | 53 MB |   
 public | idx_a_10  | index | postgres | a     | 53 MB |   
 public | idx_a_100 | index | postgres | a     | 54 MB |   
 public | idx_a_101 | index | postgres | a     | 54 MB |   
 public | idx_a_102 | index | postgres | a     | 53 MB |   
 public | idx_a_103 | index | postgres | a     | 54 MB |   
 public | idx_a_104 | index | postgres | a     | 54 MB |   
 public | idx_a_105 | index | postgres | a     | 53 MB |   
 public | idx_a_106 | index | postgres | a     | 53 MB |   
 public | idx_a_107 | index | postgres | a     | 54 MB |   
 public | idx_a_108 | index | postgres | a     | 54 MB |   
 public | idx_a_109 | index | postgres | a     | 54 MB |   
 public | idx_a_11  | index | postgres | a     | 54 MB |   
 public | idx_a_110 | index | postgres | a     | 53 MB |   
 public | idx_a_111 | index | postgres | a     | 53 MB |   
 public | idx_a_112 | index | postgres | a     | 54 MB |   
 public | idx_a_113 | index | postgres | a     | 54 MB |   
 public | idx_a_114 | index | postgres | a     | 53 MB |   
 public | idx_a_115 | index | postgres | a     | 53 MB |   
 public | idx_a_116 | index | postgres | a     | 54 MB |   
 public | idx_a_117 | index | postgres | a     | 53 MB |   
 public | idx_a_118 | index | postgres | a     | 54 MB |   
 public | idx_a_119 | index | postgres | a     | 54 MB |   
 public | idx_a_12  | index | postgres | a     | 54 MB |   
 public | idx_a_120 | index | postgres | a     | 54 MB |   
 public | idx_a_121 | index | postgres | a     | 53 MB |   
 public | idx_a_122 | index | postgres | a     | 53 MB |   
 public | idx_a_123 | index | postgres | a     | 53 MB |   
 public | idx_a_124 | index | postgres | a     | 54 MB |   
 public | idx_a_125 | index | postgres | a     | 54 MB |   
 public | idx_a_126 | index | postgres | a     | 53 MB |   
 public | idx_a_127 | index | postgres | a     | 53 MB |   
 public | idx_a_128 | index | postgres | a     | 54 MB |   
 public | idx_a_13  | index | postgres | a     | 54 MB |   
 public | idx_a_14  | index | postgres | a     | 54 MB |   
 public | idx_a_15  | index | postgres | a     | 54 MB |   
 public | idx_a_16  | index | postgres | a     | 53 MB |   
 public | idx_a_17  | index | postgres | a     | 53 MB |   
 public | idx_a_18  | index | postgres | a     | 53 MB |   
 public | idx_a_19  | index | postgres | a     | 53 MB |   
 public | idx_a_2   | index | postgres | a     | 53 MB |   
 public | idx_a_20  | index | postgres | a     | 53 MB |   
 public | idx_a_21  | index | postgres | a     | 53 MB |   
 public | idx_a_22  | index | postgres | a     | 53 MB |   
 public | idx_a_23  | index | postgres | a     | 53 MB |   
 public | idx_a_24  | index | postgres | a     | 54 MB |   
 public | idx_a_25  | index | postgres | a     | 53 MB |   
 public | idx_a_26  | index | postgres | a     | 54 MB |   
 public | idx_a_27  | index | postgres | a     | 54 MB |   
 public | idx_a_28  | index | postgres | a     | 53 MB |   
 public | idx_a_29  | index | postgres | a     | 54 MB |   
 public | idx_a_3   | index | postgres | a     | 54 MB |   
 public | idx_a_30  | index | postgres | a     | 54 MB |   
 public | idx_a_31  | index | postgres | a     | 53 MB |   
 public | idx_a_32  | index | postgres | a     | 53 MB |   
 public | idx_a_33  | index | postgres | a     | 53 MB |   
 public | idx_a_34  | index | postgres | a     | 54 MB |   
 public | idx_a_35  | index | postgres | a     | 53 MB |   
 public | idx_a_36  | index | postgres | a     | 53 MB |   
 public | idx_a_37  | index | postgres | a     | 53 MB |   
 public | idx_a_38  | index | postgres | a     | 54 MB |   
 public | idx_a_39  | index | postgres | a     | 54 MB |   
 public | idx_a_4   | index | postgres | a     | 54 MB |   
 public | idx_a_40  | index | postgres | a     | 53 MB |   
 public | idx_a_41  | index | postgres | a     | 54 MB |   
 public | idx_a_42  | index | postgres | a     | 53 MB |   
 public | idx_a_43  | index | postgres | a     | 53 MB |   
 public | idx_a_44  | index | postgres | a     | 53 MB |   
 public | idx_a_45  | index | postgres | a     | 53 MB |   
 public | idx_a_46  | index | postgres | a     | 54 MB |   
 public | idx_a_47  | index | postgres | a     | 54 MB |   
 public | idx_a_48  | index | postgres | a     | 54 MB |   
 public | idx_a_49  | index | postgres | a     | 53 MB |   
 public | idx_a_5   | index | postgres | a     | 54 MB |   
 public | idx_a_50  | index | postgres | a     | 54 MB |   
 public | idx_a_51  | index | postgres | a     | 53 MB |   
 public | idx_a_52  | index | postgres | a     | 54 MB |   
 public | idx_a_53  | index | postgres | a     | 54 MB |   
 public | idx_a_54  | index | postgres | a     | 53 MB |   
 public | idx_a_55  | index | postgres | a     | 54 MB |   
 public | idx_a_56  | index | postgres | a     | 54 MB |   
 public | idx_a_57  | index | postgres | a     | 53 MB |   
 public | idx_a_58  | index | postgres | a     | 53 MB |   
 public | idx_a_59  | index | postgres | a     | 53 MB |   
 public | idx_a_6   | index | postgres | a     | 53 MB |   
 public | idx_a_60  | index | postgres | a     | 54 MB |   
 public | idx_a_61  | index | postgres | a     | 53 MB |   
 public | idx_a_62  | index | postgres | a     | 54 MB |   
 public | idx_a_63  | index | postgres | a     | 54 MB |   
 public | idx_a_64  | index | postgres | a     | 54 MB |   
 public | idx_a_65  | index | postgres | a     | 53 MB |   
 public | idx_a_66  | index | postgres | a     | 54 MB |   
 public | idx_a_67  | index | postgres | a     | 53 MB |   
 public | idx_a_68  | index | postgres | a     | 54 MB |   
 public | idx_a_69  | index | postgres | a     | 54 MB |   
 public | idx_a_7   | index | postgres | a     | 54 MB |   
 public | idx_a_70  | index | postgres | a     | 53 MB |   
 public | idx_a_71  | index | postgres | a     | 54 MB |   
 public | idx_a_72  | index | postgres | a     | 54 MB |   
 public | idx_a_73  | index | postgres | a     | 54 MB |   
 public | idx_a_74  | index | postgres | a     | 54 MB |   
 public | idx_a_75  | index | postgres | a     | 54 MB |   
 public | idx_a_76  | index | postgres | a     | 53 MB |   
 public | idx_a_77  | index | postgres | a     | 54 MB |   
 public | idx_a_78  | index | postgres | a     | 53 MB |   
 public | idx_a_79  | index | postgres | a     | 53 MB |   
 public | idx_a_8   | index | postgres | a     | 53 MB |   
 public | idx_a_80  | index | postgres | a     | 53 MB |   
 public | idx_a_81  | index | postgres | a     | 53 MB |   
 public | idx_a_82  | index | postgres | a     | 54 MB |   
 public | idx_a_83  | index | postgres | a     | 53 MB |   
 public | idx_a_84  | index | postgres | a     | 53 MB |   
 public | idx_a_85  | index | postgres | a     | 54 MB |   
 public | idx_a_86  | index | postgres | a     | 53 MB |   
 public | idx_a_87  | index | postgres | a     | 54 MB |   
 public | idx_a_88  | index | postgres | a     | 53 MB |   
 public | idx_a_89  | index | postgres | a     | 53 MB |   
 public | idx_a_9   | index | postgres | a     | 54 MB |   
 public | idx_a_90  | index | postgres | a     | 54 MB |   
 public | idx_a_91  | index | postgres | a     | 54 MB |   
 public | idx_a_92  | index | postgres | a     | 54 MB |   
 public | idx_a_93  | index | postgres | a     | 53 MB |   
 public | idx_a_94  | index | postgres | a     | 53 MB |   
 public | idx_a_95  | index | postgres | a     | 54 MB |   
 public | idx_a_96  | index | postgres | a     | 53 MB |   
 public | idx_a_97  | index | postgres | a     | 53 MB |   
 public | idx_a_98  | index | postgres | a     | 53 MB |   
 public | idx_a_99  | index | postgres | a     | 54 MB |   
 public | idx_b_1   | index | postgres | b     | 17 GB |   
(129 rows)  
  
  
postgres=# \di+  
                          List of relations  
 Schema |  Name   | Type  |  Owner   | Table |  Size   | Description   
--------+---------+-------+----------+-------+---------+-------------  
 public | idx_c_1 | index | postgres | c     | 2250 MB |   
(1 row)  

3、多列查询CASE 1,一个条件选择性好,一个条件选择性差

postgres=# explain analyze select count(*) from a where c1<100000 and c2<10000000;  
                                                    QUERY PLAN                                                       
-------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=104.57..104.58 rows=1 width=8) (actual time=0.332..0.332 rows=1 loops=1)  
   ->  Index Scan using idx_a_1 on a  (cost=0.43..104.56 rows=1 width=0) (actual time=0.327..0.327 rows=0 loops=1)  
         Index Cond: (c1 < 100000)  
         Filter: (c2 < 10000000)  
         Rows Removed by Filter: 92  
 Planning Time: 0.977 ms  
 Execution Time: 0.399 ms  
(7 rows)  
  
postgres=# explain analyze select count(*) from b where c1<100000 and c2<10000000;  
                                                             QUERY PLAN                                                               
------------------------------------------------------------------------------------------------------------------------------------  
 Finalize Aggregate  (cost=351003.54..351003.55 rows=1 width=8) (actual time=752.842..752.842 rows=1 loops=1)  
   ->  Gather  (cost=351003.32..351003.53 rows=2 width=8) (actual time=752.829..752.834 rows=3 loops=1)  
         Workers Planned: 2  
         Workers Launched: 2  
         ->  Partial Aggregate  (cost=350003.32..350003.33 rows=1 width=8) (actual time=546.163..546.164 rows=1 loops=3)  
               ->  Parallel Seq Scan on b  (cost=0.00..349448.64 rows=221871 width=0) (actual time=499.082..546.152 rows=0 loops=3)  
                     Filter: ((c1 < 100000) AND (c2 < 10000000))  
                     Rows Removed by Filter: 1597191  
 Planning Time: 0.669 ms  
 Execution Time: 792.562 ms  
(10 rows)  
  
  
postgres=# explain analyze select count(*) from c where c1<100000 and c2<10000000;  
                                                    QUERY PLAN                                                       
-------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=18.71..18.72 rows=1 width=8) (actual time=0.776..0.776 rows=1 loops=1)  
   ->  Index Scan using idx_c_1 on c  (cost=16.50..18.71 rows=1 width=0) (actual time=0.773..0.773 rows=0 loops=1)  
         Index Cond: ((c1 < 100000) AND (c2 < 10000000))  
 Planning Time: 0.077 ms  
 Execution Time: 0.815 ms  
(5 rows)  

显然,性能最好的是独立索引,因为选择性好的一个就好了,只是少量的FILTER

为什么GIN性能不行呢?理论上应该很好呀。

https://www.postgresql.org/docs/devel/static/pageinspect.html#id-1.11.7.31.7

postgres=# SELECT * FROM gin_metapage_info(get_raw_page('idx_b_1', 0));  
 pending_head | pending_tail | tail_free_size | n_pending_pages | n_pending_tuples | n_total_pages | n_entry_pages | n_data_pages | n_entries | version   
--------------+--------------+----------------+-----------------+------------------+---------------+---------------+--------------+-----------+---------  
       230891 |      1842877 |           5600 |         1587579 |          4205010 |             2 |             1 |            0 |         0 |       2  
(1 row)  

原因,开启了FASTUPDATE的情况下,写入超快,但是合并PENDING TUPLES只有一个AUTOVACUUM WORKER,所以合并没有写入快,导致了pending tuples越来越大。如上,有几百万条还没有合并。

强制合并:

vacuum analyze b;  

耗时需要很久很久,所以实际上开启gin索引的fastupdate,如果写入长时间持续太猛了,并不好。因为合并跟不上,查询性能跟不上。

修改GIN索引的fastupdate,关闭,重测

postgres=#   alter index idx_b_1 set (fastupdate =off);  
ALTER INDEX  

重新测试

transaction type: ./test_gin.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 28  
number of threads: 28  
duration: 120 s  
number of transactions actually processed: 1638833  
latency average = 1.985 ms  
latency stddev = 0.572 ms  
tps = 13656.742727 (including connections establishing)  
tps = 13657.785073 (excluding connections establishing)  
statement latencies in milliseconds:  
         0.002  \set c1 random(1,2000000000)  
         1.983  insert into b (c1) values (:c1);  

写入速度从3.99万行/s下降到了1.36万行/s

  
Samples: 963K of event 'cpu-clock', Event count (approx.): 191435286728   
Overhead  Shared Object       Symbol                                      
   9.26%  postgres            [.] PinBuffer                               
   8.74%  postgres            [.] LWLockAcquire                           
   7.85%  postgres            [.] gintuple_get_attrnum                    
   6.01%  postgres            [.] UnpinBuffer.constprop.6                 
   5.48%  postgres            [.] hash_search_with_hash_value             
   5.46%  postgres            [.] ginFindLeafPage                         
   4.85%  postgres            [.] LWLockRelease                           
   3.89%  [kernel]            [k] _raw_spin_unlock_irqrestore             
   3.19%  [kernel]            [k] __do_softirq                            
   3.03%  [kernel]            [k] run_timer_softirq                       
   2.89%  postgres            [.] ReadBuffer_common                       
   2.51%  postgres            [.] entryLocateLeafEntry                    
   1.93%  postgres            [.] entryLocateEntry                        
   1.67%  libc-2.17.so        [.] __memset_sse2                           
   1.48%  postgres            [.] gintuple_get_key                        
   1.29%  postgres            [.] LWLockReleaseClearVar                   
   1.21%  libc-2.17.so        [.] __memmove_ssse3_back                    
   1.07%  libc-2.17.so        [.] __memcpy_ssse3_back                     
   0.89%  postgres            [.] MarkBufferDirty                         
   0.89%  postgres            [.] AllocSetAlloc                           
   0.76%  [kernel]            [k] finish_task_switch                      
   0.72%  postgres            [.] XLogInsertRecord            

写入性能下降,比128列独立索引还要慢,可以看到现在没有pending tuples了。

postgres=# SELECT * FROM gin_metapage_info(get_raw_page('idx_b_1', 0));  
 pending_head | pending_tail | tail_free_size | n_pending_pages | n_pending_tuples | n_total_pages | n_entry_pages | n_data_pages | n_entries | version   
--------------+--------------+----------------+-----------------+------------------+---------------+---------------+--------------+-----------+---------  
   4294967295 |   4294967295 |              0 |               0 |                0 |             2 |             1 |            0 |         0 |       2  
(1 row)  

多列查询CASE 1,一个条件选择性好,一个条件选择性差.

现在GIN索引的查询性能还好,但是还是不如128独立列索引。

postgres=# explain analyze select count(*) from b where c1<100000 and c2<10000000;  
                                                      QUERY PLAN                                                         
-----------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=23.13..23.14 rows=1 width=8) (actual time=3.062..3.062 rows=1 loops=1)  
   ->  Bitmap Heap Scan on b  (cost=22.01..23.12 rows=1 width=0) (actual time=3.060..3.060 rows=0 loops=1)  
         Recheck Cond: ((c1 < 100000) AND (c2 < 10000000))  
         ->  Bitmap Index Scan on idx_b_1  (cost=0.00..22.01 rows=1 width=0) (actual time=3.058..3.058 rows=0 loops=1)  
               Index Cond: ((c1 < 100000) AND (c2 < 10000000))  
 Planning Time: 0.053 ms  
 Execution Time: 3.187 ms  
(7 rows)  

4、多列查询CASE 2,两个条件都是大范围,合并后结果集较小

postgres=# explain analyze select count(*) from c where c1<100000000 and c2<100000000;  
                                                          QUERY PLAN                                                             
-------------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=1290.52..1290.53 rows=1 width=8) (actual time=20.199..20.200 rows=1 loops=1)  
   ->  Bitmap Heap Scan on c  (cost=29.59..1287.60 rows=1170 width=0) (actual time=18.299..20.091 rows=1078 loops=1)  
         Recheck Cond: ((c1 < 100000000) AND (c2 < 100000000))  
         Heap Blocks: exact=1061  
         ->  Bitmap Index Scan on idx_c_1  (cost=0.00..29.30 rows=1170 width=0) (actual time=18.179..18.179 rows=1078 loops=1)  
               Index Cond: ((c1 < 100000000) AND (c2 < 100000000))  
 Planning Time: 0.183 ms  
 Execution Time: 20.474 ms  
(8 rows)  
  
  
postgres=# explain analyze select count(*) from b where c1<100000000 and c2<100000000;  
                                                          QUERY PLAN                                                             
-------------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=4567.42..4567.43 rows=1 width=8) (actual time=105.917..105.918 rows=1 loops=1)  
   ->  Bitmap Heap Scan on b  (cost=65.90..4556.98 rows=4176 width=0) (actual time=81.593..105.503 rows=4098 loops=1)  
         Recheck Cond: ((c1 < 100000000) AND (c2 < 100000000))  
         Heap Blocks: exact=4020  
         ->  Bitmap Index Scan on idx_b_1  (cost=0.00..64.86 rows=4176 width=0) (actual time=81.067..81.067 rows=4098 loops=1)  
               Index Cond: ((c1 < 100000000) AND (c2 < 100000000))  
 Planning Time: 10.918 ms  
 Execution Time: 107.486 ms  
(8 rows)  
  
postgres=# explain analyze select count(*) from a where c1<100000000 and c2<100000000;  
                                                               QUERY PLAN                                                                  
-----------------------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=7574.38..7574.39 rows=1 width=8) (actual time=181.375..181.375 rows=1 loops=1)  
   ->  Bitmap Heap Scan on a  (cost=2231.28..7561.98 rows=4959 width=0) (actual time=62.215..180.837 rows=4860 loops=1)  
         Recheck Cond: ((c1 < 100000000) AND (c2 < 100000000))  
         Rows Removed by Index Recheck: 266961  
         Heap Blocks: exact=3166 lossy=17904  
         ->  BitmapAnd  (cost=2231.28..2231.28 rows=4959 width=0) (actual time=61.461..61.461 rows=0 loops=1)  
               ->  Bitmap Index Scan on idx_a_1  (cost=0.00..1090.57 rows=95192 width=0) (actual time=20.603..20.603 rows=95198 loops=1)  
                     Index Cond: (c1 < 100000000)  
               ->  Bitmap Index Scan on idx_a_2  (cost=0.00..1137.98 rows=99314 width=0) (actual time=37.628..37.628 rows=95665 loops=1)  
                     Index Cond: (c2 < 100000000)  
 Planning Time: 3.797 ms  
 Execution Time: 183.723 ms  
(12 rows)  

现在,性能最好的是GIN索引,其次是RUM和独立列索引(性能差不多)。

5、多列查询CASE 3,单个字段大范围,结果集大

postgres=# explain analyze select count(*) from c where c1<100000000 ;  
                                                          QUERY PLAN                                                             
-------------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=18671.14..18671.15 rows=1 width=8) (actual time=25.041..25.041 rows=1 loops=1)  
   ->  Index Scan using idx_c_1 on c  (cost=8.80..18614.25 rows=22757 width=0) (actual time=10.142..22.987 rows=22238 loops=1)  
         Index Cond: (c1 < 100000000)  
 Planning Time: 0.099 ms  
 Execution Time: 25.276 ms  
(5 rows)  
  
  
postgres=# explain analyze select count(*) from b where c1<100000000 ;  
                                                            QUERY PLAN                                                              
----------------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=63372.32..63372.33 rows=1 width=8) (actual time=139.236..139.236 rows=1 loops=1)  
   ->  Bitmap Heap Scan on b  (cost=684.51..63166.06 rows=82505 width=0) (actual time=57.762..131.941 rows=81741 loops=1)  
         Recheck Cond: (c1 < 100000000)  
         Heap Blocks: exact=58458  
         ->  Bitmap Index Scan on idx_b_1  (cost=0.00..663.89 rows=82505 width=0) (actual time=48.058..48.058 rows=81741 loops=1)  
               Index Cond: (c1 < 100000000)  
 Planning Time: 0.123 ms  
 Execution Time: 141.904 ms  
(8 rows)  
  
postgres=# explain analyze select count(*) from a where c1<100000000 ;  
                                                             QUERY PLAN                                                               
------------------------------------------------------------------------------------------------------------------------------------  
 Aggregate  (cost=78459.78..78459.79 rows=1 width=8) (actual time=109.022..109.023 rows=1 loops=1)  
   ->  Index Only Scan using idx_a_1 on a  (cost=0.43..78221.80 rows=95192 width=0) (actual time=0.028..100.343 rows=95198 loops=1)  
         Index Cond: (c1 < 100000000)  
         Heap Fetches: 95198  
 Planning Time: 0.614 ms  
 Execution Time: 109.053 ms  
(6 rows)  

现在,性能最好的是RUM和独立列索引(性能差不多),其次是GIN索引。

小结

索引模式 写入速度 索引大小 单列查询性能(选择性好) 单列查询性能(选择性不好) 多列查询性能(单列选择性都不好,但是合并后选择性很好) 多列查询性能(有些列选择性很好)
128列单个gin(fastupdate=on)索引 3.99万行/s 最大 不好(pending tuples很大时) 不好(选择性不好本身就不适合用索引过滤) 不好(pending tuples很大时) 不好(pending tuples很大时)
128列单个gin(fastupdate=off)索引 1.36万行/s 最大 不好(选择性不好本身就不适合用索引过滤) 一般
128列单个rum索引 0.378万行/s 较小 很好 不好(选择性不好本身就不适合用索引过滤) 一般
128列128个btree索引 1.59万行/s 较小 很好 不好(选择性不好本身就不适合用索引过滤) 一般

所以,建议使用每列独立的索引,来支撑任意列的组合查询,可以获得写入、读取较好的均衡。

选择性不好的列,建议不要使用索引,选择性通过pg_stats查看

analyze c;

postgres=# select schemaname,tablename,attname,n_distinct from pg_stats where tablename='c';
 schemaname | tablename | attname | n_distinct 
------------+-----------+---------+------------
 public     | c         | c1      |         -1
 public     | c         | c2      |  -0.999025
 public     | c         | c3      |         -1
...

《PostgreSQL SQL自动优化案例 - 极简,自动推荐索引》

《自动选择正确索引访问接口(btree,hash,gin,gist,sp-gist,brin,bitmap...)的方法》

为什么写入性能会是这样的结果?

同步模式BUILD索引,索引越多,写入的影响越大,本文对单表的测试可以看出,128个索引,写入只能达到1.59万行/s。如果硬件资源足够的话,可以通过使用分区表(多个表)来解决写入的瓶颈问题。

异步模式BUILD索引,(GIN开启fastupdate),写入吞吐可以提高,但是如果长时间处于高吞吐的写入下,由于一张表只有一个VACUUM WORKER进程合并pending tuples,所以可能导致PENDING TUPLES会越来越多,导致查询性能下降。解决办法是分区表,这样每个分区可以有1个vacuum worker参与合并,降低延迟。

为什么查询性能在几种情况下会是这样的结果?

选择性好的情况下,三种索引,性能都很好。

多列查询,如果所有条件的选择性都不好,但是最后所有条件的合并选择性如果很好时,GIN内置的BITMAP过滤,使得性能在RUM,GIN,BTREE多列三种情况下达到最好。

多列查询,如果所有条件的选择性都不好,并且最后所有条件合并后的选择性也不好时,所有索引的性能都一般。

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
目录
相关文章
|
29天前
|
SQL 关系型数据库 数据库
PostgreSQL性能飙升的秘密:这几个调优技巧让你的数据库查询速度翻倍!
【10月更文挑战第25天】本文介绍了几种有效提升 PostgreSQL 数据库查询效率的方法,包括索引优化、查询优化、配置优化和硬件优化。通过合理设计索引、编写高效 SQL 查询、调整配置参数和选择合适硬件,可以显著提高数据库性能。
157 1
|
4月前
|
缓存 关系型数据库 数据库
PostgreSQL 查询性能
【8月更文挑战第5天】PostgreSQL 查询性能
82 8
|
5月前
|
Java 关系型数据库 API
使用Spring Boot和PostgreSQL构建高级查询
使用Spring Boot和PostgreSQL构建高级查询
|
关系型数据库 分布式数据库 PolarDB
|
关系型数据库 分布式数据库 定位技术
PolarDB for PostgreSQL 开源必读手册-VACUUM处理(中)
PolarDB for PostgreSQL 开源必读手册-VACUUM处理
169 0
|
关系型数据库 分布式数据库 PolarDB
《阿里云产品手册2022-2023 版》——PolarDB for PostgreSQL
《阿里云产品手册2022-2023 版》——PolarDB for PostgreSQL
365 0
|
存储 缓存 关系型数据库
|
存储 SQL 并行计算
PolarDB for PostgreSQL 开源必读手册-开源PolarDB for PostgreSQL架构介绍(中)
PolarDB for PostgreSQL 开源必读手册-开源PolarDB for PostgreSQL架构介绍
426 0
|
存储 算法 安全
PolarDB for PostgreSQL 开源必读手册-开源PolarDB for PostgreSQL架构介绍(下)
PolarDB for PostgreSQL 开源必读手册-开源PolarDB for PostgreSQL架构介绍
384 0
|
关系型数据库 分布式数据库 开发工具

相关产品

  • 云原生数据库 PolarDB
  • 云数据库 RDS PostgreSQL 版