修改一行代码提升 Postgres 性能 100 倍？

不久前看了一篇名为：修改一行代码提升 Postgres 性能 100 倍的文章。
http://www.iteye.com/news/28235

今天自己碰到类似的查询，就又回去看了一遍，发现我和他的场景还是比较类似的。所以索性就模拟一下。

先介绍一下我的环境，首先呢，是一张交易表transaction，有transaction_id varchar(32)，user_id int4，等字段。其中有1100万条数据。transaction_id做了唯一btree索引。

另一张是用户表user，id int4是主键。

现在要查出指定1000个transaction_id的交易记录和用户信息，下面是我的查询语句和执行计划。
--(1)
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM transaction t
JOIN user u ON t.user_id = u.id
WHERE transaction_id IN ('p1605170001BC33752830','p160519000129B89658337','p1603280001EE231575962',...和其它996个...,'p1604190001F9188378917')
AND t.user_id = 5392;

Nested Loop (cost=4580.87..4594.16 rows=1 width=2072) (actual time=17.694..17.706 rows=4 loops=1)
Buffers: shared hit=4038
-> Bitmap Heap Scan on transaction t (cost=4580.58..4585.84 rows=1 width=470) (actual time=17.686..17.690 rows=4 loops=1)

    Recheck Cond: ((user_id = 5392) AND ((transaction_id)::text = ANY ('{p1605170001BC33752830,p160519000129B89658337,p1603280001EE231575962,...和其它996个...,p1604190001F9188378917}'::text[])))
    Heap Blocks: exact=4
    Buffers: shared hit=4026
    ->  BitmapAnd  (cost=4580.58..4580.58 rows=1 width=0) (actual time=17.680..17.680 rows=0 loops=1)
          Buffers: shared hit=4022
          ->  Bitmap Index Scan on fk_transaction_user_id  (cost=0.00..34.33 rows=1852 width=0) (actual time=0.129..0.129 rows=1001 loops=1)
                Index Cond: (user_id = 5392)
                Buffers: shared hit=7
          ->  Bitmap Index Scan on uk_transaction  (cost=0.00..4546.00 rows=1000 width=0) (actual time=17.482..17.482 rows=1000 loops=1)
                Index Cond: ((transaction_id)::text = ANY ('{p1605170001BC33752830,p160519000129B89658337,p1603280001EE231575962,...和其它996个...,p1604190001F9188378917}'::text[]))
                Buffers: shared hit=4015

-> Index Scan using user_pkey on user u (cost=0.29..8.30 rows=1 width=1602) (actual time=0.002..0.002 rows=1 loops=4)

    Index Cond: (id = 5392)
    Buffers: shared hit=12

Planning time: 0.734 ms
Execution time: 22.647 ms

--(2)
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM transaction t
JOIN user u ON t.user_id = u.id
WHERE transaction_id = ANY ('{p1605170001BC33752830,p160519000129B89658337,p1603280001EE231575962,...和其它996个...,p1604190001F9188378917}'::TEXT[])
AND t.user_id = 5392;

Nested Loop (cost=4565.40..4578.69 rows=1 width=2064) (actual time=17.698..17.710 rows=4 loops=1)
Buffers: shared hit=4037
-> Bitmap Heap Scan on transaction t (cost=4565.11..4570.37 rows=1 width=462) (actual time=17.688..17.691 rows=4 loops=1)

    Recheck Cond: ((user_id = 5392) AND ((transaction_id)::text = ANY ('{p1605170001BC33752830,p160519000129B89658337,p1603280001EE231575962,...和其它996个...,p1604190001F9188378917}'::text[])))
    Heap Blocks: exact=4
    Buffers: shared hit=4025
    ->  BitmapAnd  (cost=4565.11..4565.11 rows=1 width=0) (actual time=17.683..17.683 rows=0 loops=1)
          Buffers: shared hit=4021
          ->  Bitmap Index Scan on fk_transaction_user_id  (cost=0.00..25.36 rows=1190 width=0) (actual time=0.169..0.169 rows=1001 loops=1)
                Index Cond: (user_id = 5392)
                Buffers: shared hit=7
          ->  Bitmap Index Scan on uk_transaction  (cost=0.00..4539.50 rows=1000 width=0) (actual time=17.445..17.445 rows=1000 loops=1)
                Index Cond: ((transaction_id)::text = ANY ('{p1605170001BC33752830,p160519000129B89658337,p1603280001EE231575962,...和其它996个...,p1604190001F9188378917}'::text[]))
                Buffers: shared hit=4014

-> Index Scan using user_pkey on user u (cost=0.29..8.30 rows=1 width=1602) (actual time=0.003..0.003 rows=1 loops=4)

    Index Cond: (user_id = 5392)
    Buffers: shared hit=12

Planning time: 0.754 ms
Execution time: 22.680 ms

--(3)
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM transaction t
JOIN user u ON t.user_id = u.id
WHERE transaction_id = ANY (VALUES('p1605170001BC33752830'),('p160519000129B89658337'),('p1603280001EE231575962'),...和其它996个...,('p1604190001F9188378917'))
AND t.user_id = 5392;

Nested Loop (cost=15.85..1749.26 rows=595 width=2064) (actual time=2.168..19.464 rows=4 loops=1)
Buffers: shared hit=5018
-> Index Scan using user_pkey on user u (cost=0.29..8.30 rows=1 width=1602) (actual time=0.006..0.006 rows=1 loops=1)

    Index Cond: (user_id = 5392)
    Buffers: shared hit=3

-> Nested Loop (cost=15.56..1735.00 rows=595 width=462) (actual time=2.160..19.454 rows=4 loops=1)

    Buffers: shared hit=5015
    ->  HashAggregate  (cost=15.00..17.00 rows=200 width=32) (actual time=0.424..0.572 rows=1000 loops=1)
          Group Key: "*VALUES*".column1
          ->  Values Scan on "*VALUES*"  (cost=0.00..12.50 rows=1000 width=32) (actual time=0.003..0.124 rows=1000 loops=1)
    ->  Index Scan using uk_transaction on transaction t  (cost=0.56..8.58 rows=1 width=462) (actual time=0.019..0.019 rows=0 loops=1000)
          Index Cond: ((transaction_id)::text = "*VALUES*".column1)
          Filter: (user_id = 5392)
          Rows Removed by Filter: 1
          Buffers: shared hit=5015

Planning time: 0.943 ms
Execution time: 19.581 ms

其中（1）和（2）的执行计划是一模一样的。查询时间多次执行后稳定在30ms，（3）就是那篇博客推荐的方法，用时稳定在27ms。只有3ms的提升。其实两个查询的执行情况，都是很不错的。我奇怪的是，那篇博客中用了20几秒的查询，为啥我模拟不出来呢？

修改一行代码提升 Postgres 性能 100 倍？

相关电子书

相关实验场景

修改一行代码提升 Postgres 性能 100 倍？

相关文章

相关电子书

相关实验场景