PostgreSQL bitmapAnd, bitmapOr, bitmap index scan, bitmap heap scan-阿里云开发者社区

背景

在PostgreSQL中，多个单列索引是可以用在组合查询SQL中的，也就是说实现了bitmap scan。

比如

select * from tbl where c1=1 and c2=1 or c3=1;

用到了3列，如果这3列分别有一个索引，那么PostgreSQL会使用这三个索引的bitmap scan。

PostgreSQL是如何处理多个组合条件的BITMAP SCAN的呢？

Bitmap Heap Scan on customers  (cost=25.76..61.62 rows=10 width=13) (actual time=0.077..0.077 rows=2 loops=1)  
  Recheck Cond: (((username)::text < 'user100'::text) AND (customerid < 1000))  
  ->  BitmapAnd  (cost=25.76..25.76 rows=10 width=0) (actual time=0.073..0.073 rows=0 loops=1)  
        ->  Bitmap Index Scan on ix_cust_username  (cost=0.00..5.75 rows=200 width=0) (actual time=0.006..0.006 rows=2 loops=1)  
              Index Cond: ((username)::text < 'user100'::text)  
        ->  Bitmap Index Scan on customers_pkey  (cost=0.00..19.75 rows=1000 width=0) (actual time=0.065..0.065 rows=999 loops=1)  
              Index Cond: (customerid < 1000)

bitmap scan原理

对于每个查询条件，在对应索引中找到符合条件的堆表PAGE，每个索引构造一个bitmap串。

在这个bitmap串中，每一个BIT位对应一个HEAP PAGE，代表这个HEAP PAGE中有符合该条件的行。

根据条件的多少，组成了多个bitmap。

例如 a=1 or a=2 是两个bitmap。

postgres=# explain select * from tbl where id in (1,2,3);  -- in 可以直接使用index扫描  
                                QUERY PLAN                                  
--------------------------------------------------------------------------  
 Index Scan using idx_tbl on tbl  (cost=0.29..257.15 rows=10000 width=12)  
   Index Cond: (id = ANY ('{1,2,3}'::integer[]))  
(2 rows)  

postgres=# explain select * from tbl where id =1 or id=2 or id=3;  
                                    QUERY PLAN                                      
----------------------------------------------------------------------------------  
 Bitmap Heap Scan on tbl  (cost=124.97..354.97 rows=10000 width=12)  
   Recheck Cond: ((id = 1) OR (id = 2) OR (id = 3))  
   ->  BitmapOr  (cost=124.97..124.97 rows=10000 width=0)  
         ->  Bitmap Index Scan on idx_tbl  (cost=0.00..114.28 rows=10000 width=0)  
               Index Cond: (id = 1)  
         ->  Bitmap Index Scan on idx_tbl  (cost=0.00..1.59 rows=1 width=0)  
               Index Cond: (id = 2)  
         ->  Bitmap Index Scan on idx_tbl  (cost=0.00..1.59 rows=1 width=0)  
               Index Cond: (id = 3)  
(9 rows)  

postgres=# explain select * from tbl where id=1 and id=2 or id=3;  
                                 QUERY PLAN                                   
----------------------------------------------------------------------------  
 Bitmap Heap Scan on tbl  (cost=3.19..4.51 rows=1 width=12)  
   Recheck Cond: (((id = 1) AND (id = 2)) OR (id = 3))  
   ->  BitmapOr  (cost=3.19..3.19 rows=1 width=0)  
         ->  Bitmap Index Scan on idx_tbl  (cost=0.00..1.59 rows=1 width=0)  
               Index Cond: ((id = 1) AND (id = 2))  -- and 可以合并  
         ->  Bitmap Index Scan on idx_tbl  (cost=0.00..1.59 rows=1 width=0)  
               Index Cond: (id = 3)  
(7 rows)

不同字段也一样。

postgres=# create table t12 (c1 int, c2 int, c3 int);  
CREATE TABLE  
postgres=# create index t12_c1 on t12 (c1);  
CREATE INDEX  
postgres=# create index t12_c2 on t12 (c2);  
CREATE INDEX  
postgres=# create index t12_c3 on t12 (c3);  
CREATE INDEX  

postgres=# explain select * from t12 where c1=1 and c2=1 or c3=1;  
                                    QUERY PLAN                                      
----------------------------------------------------------------------------------  
 Bitmap Heap Scan on t12  (cost=4.84..12.36 rows=10 width=12)  
   Recheck Cond: (((c2 = 1) AND (c1 = 1)) OR (c3 = 1))  
   ->  BitmapOr  (cost=4.84..4.84 rows=10 width=0)  
         ->  BitmapAnd  (cost=3.31..3.31 rows=1 width=0)  
               ->  Bitmap Index Scan on t12_c2  (cost=0.00..1.53 rows=10 width=0)  
                     Index Cond: (c2 = 1)  
               ->  Bitmap Index Scan on t12_c1  (cost=0.00..1.53 rows=10 width=0)  
                     Index Cond: (c1 = 1)  
         ->  Bitmap Index Scan on t12_c3  (cost=0.00..1.53 rows=10 width=0)  
               Index Cond: (c3 = 1)  
(10 rows)  

postgres=# explain select * from t12 where c1=1 and c2=1 or c3=1 or c3=2;  
                                    QUERY PLAN                                      
----------------------------------------------------------------------------------  
 Bitmap Heap Scan on t12  (cost=6.38..16.78 rows=20 width=12)  
   Recheck Cond: (((c2 = 1) AND (c1 = 1)) OR (c3 = 1) OR (c3 = 2))  
   ->  BitmapOr  (cost=6.38..6.38 rows=20 width=0)  
         ->  BitmapAnd  (cost=3.31..3.31 rows=1 width=0)  
               ->  Bitmap Index Scan on t12_c2  (cost=0.00..1.53 rows=10 width=0)  
                     Index Cond: (c2 = 1)  
               ->  Bitmap Index Scan on t12_c1  (cost=0.00..1.53 rows=10 width=0)  
                     Index Cond: (c1 = 1)  
         ->  Bitmap Index Scan on t12_c3  (cost=0.00..1.53 rows=10 width=0)  
               Index Cond: (c3 = 1)  
         ->  Bitmap Index Scan on t12_c3  (cost=0.00..1.53 rows=10 width=0)  
               Index Cond: (c3 = 2)  
(12 rows)

在生成多个bitmap串后，对这些bitmap串执行BIT & | 操作，得到一个新的BIT串，然后根据这个BIT串，顺序搜索bit=1 (bit=0的heap page不会被扫描) 的对应的数据块(也就是bitmap heap scan)。

因为bitmap index scan返回的是块级别的bit串，所以在bitmap heap scan时还需要recheck。即搜索对应的heap page里的所有tuple(行)，同时对bitmap index scan的条件进行再次过滤。

stackoverflow相关问题

问题

How does PostgreSQL knows by just a bitmap anything about rows' physical order?

回答

The bitmap is one bit per heap page. The bitmap index scan sets the bits based on the heap page address that the index entry points to.  

So when it goes to do the bitmap heap scan, it just does a linear table scan, reading the bitmap to see whether it should bother with a particular page or seek over it.

问题

Or generates the bitmap so that any element of it can be mapped to the pointer to a page easily?

回答

No, the bitmap corresponds 1:1 to heap pages.  

I wrote some more on this here.  

OK, it looks like you might be misunderstanding what "bitmap" means in this context.  

It's not a bit string like "101011" that's created for each heap page, or each index read, or whatever.  

The whole bitmap is a single bit array, with as many bits as there are heap pages in the relation being scanned.  

One bitmap is created by the first index scan, starting off with all entries 0 (false). Whenever an index entry that matches the search condition is found, the heap address pointed to by that index entry is looked up as an offset into the bitmap, and that bit is set to 1 (true). So rather than looking up the heap page directly, the bitmap index scan looks up the corresponding bit position in the bitmap.  

The second and further bitmap index scans do the same thing with the other indexes and the search conditions on them.  

Then each bitmap is ANDed together. The resulting bitmap has one bit for each heap page, where the bits are true only if they were true in all the individual bitmap index scans, i.e. the search condition matched for every index scan. These are the only heap pages we need to bother to load and examine. Since each heap page might contain multiple rows, we then have to examine each row to see if it matches all the conditions - that's what the "recheck cond" part is about.  

One crucial thing to understand with all this is that the tuple address in an index entry points to the row's ctid, which is a combination of the heap page number and the offset within the heap page. A bitmap index scan ignores the offsets, since it'll check the whole page anyway, and sets the bit if any row on that page matches the condition.  

Graphical example  

Heap, one square = one page:  
+---------------------------------------------+  
|c____u_____X___u___X_________u___cXcc______u_|  
+---------------------------------------------+  
Rows marked c match customers pkey condition.  
Rows marked u match username condition.  
Rows marked X match both conditions.  


Bitmap scan from customers_pkey:  
+---------------------------------------------+  
|100000000001000000010000000000000111100000000| bitmap 1  
+---------------------------------------------+  
One bit per heap page, in the same order as the heap  
Bits 1 when condition matches, 0 if not  

Bitmap scan from ix_cust_username:  
+---------------------------------------------+  
|000001000001000100010000000001000010000000010| bitmap 2  
+---------------------------------------------+  
Once the bitmaps are created a bitwise AND is performed on them:  

+---------------------------------------------+  
|100000000001000000010000000000000111100000000| bitmap 1  
|000001000001000100010000000001000010000000010| bitmap 2  
 &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&  
|000000000001000000010000000000000010000000000| Combined bitmap  
+-----------+-------+--------------+----------+  
            |       |              |  
            v       v              v  
Used to scan the heap only for matching pages:  
+---------------------------------------------+  
|___________X_______X______________X__________|  
+---------------------------------------------+  
The bitmap heap scan then seeks to the start of each page and reads the page:  

+---------------------------------------------+  
|___________X_______X______________X__________|  
+---------------------------------------------+  
seek------->^seek-->^seek--------->^  
            |       |              |  
            ------------------------  
            only these pages read  
and each read page is then re-checked against the condition since there can be >1 row per page and not all necessarily match the condition.

哪些查询可能会使用bitmap index scan

1. btree 索引的多个组合查询

不同列的 and\or 查询

相同列的 or 查询

2. brin 索引

由于brin索引本身存储的就是一些连续块的元信息，所以本身就无法实现精确查询，所以通过brin查询时，首先也是构建heap page的bitmap串，（符合条件的为1，不符合条件的为0），然后根据这个bitmap串搜索tuple.

并在bitmap heap scan阶段 recheck 条件。

postgres-# \d cluster_test_brin   
     Unlogged table "public.cluster_test_brin"  
  Column  |            Type             | Modifiers   
----------+-----------------------------+-----------  
 id       | integer                     |   
 info     | text                        |   
 crt_time | timestamp without time zone |   
Indexes:  
    "idx_cluster_test_brin_id" brin (id) WITH (pages_per_range='128')  

postgres=# explain select * from cluster_test_brin where id=1;  
                                         QUERY PLAN                                            
---------------------------------------------------------------------------------------------  
 Bitmap Heap Scan on cluster_test_brin  (cost=115.60..13767.33 rows=10722 width=45)  
   Recheck Cond: (id = 1)  
   ->  Bitmap Index Scan on idx_cluster_test_brin_id  (cost=0.00..112.92 rows=10722 width=0)  
         Index Cond: (id = 1)  
(4 rows)

3. gin 索引

gin 索引存储的是KEY，以及ctid （heap行号）组成的posting list或posting tree，它理论上是可以支持index scan的，但是PostgreSQL目前仅对GIN实施了bitmap scan。

所以在使用gin索引时，首先也是构造heap page的bitmap串，（符合条件的为1，不符合条件的为0），然后根据这个bitmap串搜索tuple.

并在bitmap heap scan阶段 recheck 条件。

这也是目前gin值得改进的地方。

postgres=# \d cluster_test_btree   
     Unlogged table "public.cluster_test_btree"  
  Column  |            Type             | Modifiers   
----------+-----------------------------+-----------  
 id       | integer                     |   
 info     | text                        |   
 crt_time | timestamp without time zone |   
Indexes:  
    "idx_cluster_test_gin" gin (id)  

postgres=# explain select * from cluster_test_btree where id=1;  
                                       QUERY PLAN                                         
----------------------------------------------------------------------------------------  
 Bitmap Heap Scan on cluster_test_btree  (cost=90.83..13732.45 rows=10714 width=45)  
   Recheck Cond: (id = 1)  
   ->  Bitmap Index Scan on idx_cluster_test_gin  (cost=0.00..88.16 rows=10714 width=0)  
         Index Cond: (id = 1)  
(4 rows)