PostgreSQL 多字段任意组合搜索的性能-阿里云开发者社区

PostgreSQL 多字段任意组合搜索的性能

2017-11-12 3693

版权

本文内容由阿里云实名注册用户自发贡献，版权归原作者所有，阿里云开发者社区不拥有其著作权，亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容，填写侵权投诉表单进行举报，一经查实，本社区将立刻删除涉嫌侵权内容。

本文涉及的产品

RDS MySQL Serverless 基础系列，0.5-2RCU 50GB

云数据库 RDS MySQL，集群系列 2核4GB

RDS SQL Server Serverless，2-4RCU 50GB 3个月

简介：

背景

PostgreSQL 多字段，任意组合搜索，有三种优化手段：

1、gin索引（支持任意字段组合的查询）

《宝剑赠英雄 - 任意组合字段等效查询, 探探PostgreSQL多列展开式B树 (GIN)》

2、bloom索引（支持任意只读组合的等值查询）

《PostgreSQL 9.6 黑科技 bloom 算法索引，一个索引支撑任意列组合查询》

3、每个单列btree索引（支持任意字段组合的查询）

《PostgreSQL bitmapAnd, bitmapOr, bitmap index scan, bitmap heap scan》

例子

create table test(c1 int, c2 int, c3 int, c4 int, c5 int);  

        
          
        
        
        
          
          AI 代码解读

bloom, gin, multi-btree几种索引创建方法

1、bloom

postgres=# create extension bloom ;  
CREATE EXTENSION  
postgres=# create index idx_test12_1 on test12 using bloom (c1,c2,c3,c4,c5);  
CREATE INDEX  
postgres=# explain select * from test12 where c1=1;  
                                 QUERY PLAN                                   
----------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=13.95..20.32 rows=8 width=20)  
   Recheck Cond: (c1 = 1)  
   ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..13.95 rows=8 width=0)  
         Index Cond: (c1 = 1)  
(4 rows)  
postgres=# explain select * from test12 where c1=1 and c2=1;  
                                 QUERY PLAN                                   
----------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=18.20..19.42 rows=1 width=20)  
   Recheck Cond: ((c1 = 1) AND (c2 = 1))  
   ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..18.20 rows=1 width=0)  
         Index Cond: ((c1 = 1) AND (c2 = 1))  
(4 rows)  
postgres=# explain select * from test12 where c1=1 or c2=1;  
                                    QUERY PLAN                                      
----------------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=27.91..38.16 rows=17 width=20)  
   Recheck Cond: ((c1 = 1) OR (c2 = 1))  
   ->  BitmapOr  (cost=27.91..27.91 rows=17 width=0)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..13.95 rows=8 width=0)  
               Index Cond: (c1 = 1)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..13.95 rows=8 width=0)  
               Index Cond: (c2 = 1)  
(7 rows)  

        
          
        
        
        
          
          AI 代码解读

2、gin

postgres=# create extension btree_gin;  
CREATE EXTENSION  
postgres=# create index idx_test12_1 on test12 using gin (c1,c2,c3,c4,c5);  
CREATE INDEX  
postgres=# explain select * from test12 where c1=1 or c2=1;  
                                   QUERY PLAN                                      
---------------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=4.94..15.19 rows=17 width=20)  
   Recheck Cond: ((c1 = 1) OR (c2 = 1))  
   ->  BitmapOr  (cost=4.94..4.94 rows=17 width=0)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..2.46 rows=8 width=0)  
               Index Cond: (c1 = 1)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..2.46 rows=8 width=0)  
               Index Cond: (c2 = 1)  
(7 rows)  
  
postgres=# explain select * from test12 where c1=1 and c2=1;  
                                QUERY PLAN                                   
---------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=3.60..4.82 rows=1 width=20)  
   Recheck Cond: ((c1 = 1) AND (c2 = 1))  
   ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..3.60 rows=1 width=0)  
         Index Cond: ((c1 = 1) AND (c2 = 1))  
(4 rows)  

        
          
        
        
        
          
          AI 代码解读

3、multi-btree

postgres=# drop index idx_test12_1 ;  
DROP INDEX  
postgres=# create index idx_test12_1 on test12 using btree(c1);  
CREATE INDEX  
postgres=# create index idx_test12_2 on test12 using btree(c2);  
CREATE INDEX  
postgres=# create index idx_test12_3 on test12 using btree(c3);  
CREATE INDEX  
postgres=# create index idx_test12_4 on test12 using btree(c4);  
CREATE INDEX  
postgres=# create index idx_test12_5 on test12 using btree(c5);  
CREATE INDEX  
  
postgres=# explain select * from test12 where c1=1 and c2=1;  
                                   QUERY PLAN                                      
---------------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=3.08..4.29 rows=1 width=20)  
   Recheck Cond: ((c2 = 1) AND (c1 = 1))  
   ->  BitmapAnd  (cost=3.08..3.08 rows=1 width=0)  
         ->  Bitmap Index Scan on idx_test12_2  (cost=0.00..1.41 rows=8 width=0)  
               Index Cond: (c2 = 1)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..1.41 rows=8 width=0)  
               Index Cond: (c1 = 1)  
(7 rows)  
  
postgres=# explain select * from test12 where c1=1 or c2=1;  
                                   QUERY PLAN                                      
---------------------------------------------------------------------------------  
 Bitmap Heap Scan on test12  (cost=2.83..13.09 rows=17 width=20)  
   Recheck Cond: ((c1 = 1) OR (c2 = 1))  
   ->  BitmapOr  (cost=2.83..2.83 rows=17 width=0)  
         ->  Bitmap Index Scan on idx_test12_1  (cost=0.00..1.41 rows=8 width=0)  
               Index Cond: (c1 = 1)  
         ->  Bitmap Index Scan on idx_test12_2  (cost=0.00..1.41 rows=8 width=0)  
               Index Cond: (c2 = 1)  
(7 rows)  

        
          
        
        
        
          
          AI 代码解读

gin, bloom, btree bitmap scan的性能如何呢？

1600个列的宽表，任意字段组合搜索性能

1、建表

postgres=# do language plpgsql $$  
declare  
  sql text;  
begin  
  sql := 'create table test1 (';  
  for i in 1..1600 loop  
    sql := sql||' c'||i||' int2 default random()*100,';  
  end loop;  
  sql := rtrim(sql,',');  
  sql := sql||')';  
  execute sql;   
    
  for i in 1..1600 loop   
    execute 'create index idx_test1_'||i||' on test1 (c'||i||')';   
  end loop;  
end;  
$$;  
DO  

        
          
        
        
        
          
          AI 代码解读

2、写入测试数据

postgres=# insert into test1 (c1)  select generate_series(1,10000);  
INSERT 0 10000  

        
          
        
        
        
          
          AI 代码解读

3、测试脚本

vi test.sql  
  
\set c2 random(1,100)  
\set c3 random(1,100)  
\set c4 random(1,100)  
\set c5 random(1,100)  
\set c6 random(1,100)  
\set c7 random(1,100)  
select c1600 from test1 where c2=:c2 and c3=:c3 and c4=:c4 or (c5=:c5 and c6=:c6 and c7=:c7);  

        
          
        
        
        
          
          AI 代码解读

4、测试

pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 120  

        
          
        
        
        
          
          AI 代码解读

5、性能

progress: 33.0 s, 208797.8 tps, lat 0.307 ms stddev 0.016  
progress: 34.0 s, 208516.0 tps, lat 0.307 ms stddev 0.032  
progress: 35.0 s, 208574.0 tps, lat 0.307 ms stddev 0.050  
progress: 36.0 s, 208858.2 tps, lat 0.306 ms stddev 0.013  
progress: 37.0 s, 208686.8 tps, lat 0.307 ms stddev 0.043  
progress: 38.0 s, 208764.2 tps, lat 0.307 ms stddev 0.013  

        
          
        
        
        
          
          AI 代码解读

注意，使用prepared statement，可以减少硬解析，提高性能。

从测试来看，任意字段的搜索，可以达到0.3毫秒的响应。

PostgreSQL 多字段任意组合搜索的性能

标签

背景

bloom, gin, multi-btree几种索引创建方法

1600个列的宽表，任意字段组合搜索性能

关系型数据库

热门文章

最新文章

相关产品

相关课程

相关电子书

相关实验场景

推荐镜像

探索云世界

热门

云计算

大数据

云原生

人工智能

数据库

开发与运维

PostgreSQL 多字段任意组合搜索的性能

标签

背景

bloom, gin, multi-btree几种索引创建方法

1600个列的宽表，任意字段组合搜索性能

关系型数据库

热门文章

最新文章

相关产品

相关课程

相关电子书

相关实验场景

推荐镜像