PostgreSQL 11 内核优化 - 降低vacuum cleanup阶段index scan概率 ( vacuum_cleanup_index_scale_factor , skip index vacuum cleanup stage)-阿里云开发者社区

开发者社区> 阿里云数据库> 正文

PostgreSQL 11 内核优化 - 降低vacuum cleanup阶段index scan概率 ( vacuum_cleanup_index_scale_factor , skip index vacuum cleanup stage)

简介: PostgreSQL 11 内核优化 - 降低vacuum cleanup阶段index scan概率 ( vacuum_cleanup_index_scale_factor , skip index vacuum cleanup stage)

背景

在执行vacuum时,有一个cleanup阶段,以往,不管这个阶段是否需要清理PAGE,只要表上面有索引,就需要对这个表的索引全部扫描一遍。

今天,PG 11版本,增加了一个GUC参数vacuum_cleanup_index_scale_factor,以及btree索引参数vacuum_cleanup_index_scale_factor。在btree index meta page上存储了当前表有多少条记录(仅仅当vacuum时发现整个表没有dead tuples(取自pg_stat_all_tables这种统计计数器)时更新meta page),当

1、((表上的 pg_stat_all_tables insert 计数器 - meta page)/meta page) 大于 vacuum_cleanup_index_scale_factor 时,vacuum cleanup阶段才需要SCAN INDEX,更新INDEX stats信息(包括meta page计数器信息)。

2、有deleted pages that can be recycled during cleanup需要清理时,必定要scan index pages.

因此,对于大量INSERT,没有UPDATE,DELETE操作的表的VACUUM,或者常规静态表的VACUUM会快很多,因为不需要SCAN INDEX了。

背景技术

vacuum_cleanup_index_scale_factor (floating point)

Specifies the fraction of the total number of heap tuples counted in the previous statistics collection     
that can be inserted     
without incurring an index scan at the VACUUM cleanup stage.     
This setting currently applies to B-tree indexes only.    
    
If no tuples were deleted from the heap,     
B-tree indexes are still scanned at the VACUUM cleanup stage     
when at least one of the following conditions is met:     
1、the index statistics are stale,     
2、or the index contains deleted pages that can be recycled during cleanup.     
Index statistics are considered to be stale if     
the number of newly inserted tuples     
exceeds the vacuum_cleanup_index_scale_factor fraction of the total number of heap tuples detected by the previous statistics collection.     
    
The total number of heap tuples is stored in the index meta-page.     
Note that the meta-page does not include this data until VACUUM finds no dead tuples,     
so B-tree index scan at the cleanup stage can only be skipped     
if the second and subsequent VACUUM cycles detect no dead tuples.  (典型的insert only场景,或者vacuum干掉所有dead tuple后)    
    
    
The value can range from 0 to 10000000000.     
When vacuum_cleanup_index_scale_factor is set to 0,     
index scans are never skipped during VACUUM cleanup.     
The default value is 0.1.    

B-tree indexes additionally accept this parameter:

vacuum_cleanup_index_scale_factor

Per-index value for vacuum_cleanup_index_scale_factor.

相关代码

src/backend/access/nbtree/nbtree.c

/*    
 * _bt_vacuum_needs_cleanup() -- Checks if index needs cleanup assuming that    
 *                      btbulkdelete() wasn't called.    
 */    
static bool    
_bt_vacuum_needs_cleanup(IndexVacuumInfo *info)    
{    
.... ....    
    
        {    
                StdRdOptions *relopts;    
                float8          cleanup_scale_factor;    
                float8          prev_num_heap_tuples;    
    
                /*    
                 * If table receives enough insertions and no cleanup was performed,    
                 * then index would appear have stale statistics.  If scale factor is    
                 * set, we avoid that by performing cleanup if the number of inserted    
                 * tuples exceeds vacuum_cleanup_index_scale_factor fraction of    
                 * original tuples count.    
                 */    
                relopts = (StdRdOptions *) info->index->rd_options;    
                cleanup_scale_factor = (relopts &&    
                                                                relopts->vacuum_cleanup_index_scale_factor >= 0)    
                        ? relopts->vacuum_cleanup_index_scale_factor    
                        : vacuum_cleanup_index_scale_factor;    
                prev_num_heap_tuples = metad->btm_last_cleanup_num_heap_tuples;    
    
                if (cleanup_scale_factor <= 0 ||    
                        prev_num_heap_tuples < 0 ||    
                        (info->num_heap_tuples - prev_num_heap_tuples) /    
                        prev_num_heap_tuples >= cleanup_scale_factor)     // 是否需要scan index,当判定index为stale状态时,由计数器与vacuum_cleanup_index_scale_factor参数控制。    
                        result = true;    
        }    
/*    
 * Post-VACUUM cleanup.    
 *    
 * Result: a palloc'd struct containing statistical info for VACUUM displays.    
 */    
IndexBulkDeleteResult *    
btvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)    
{    
        /* No-op in ANALYZE ONLY mode */    
        if (info->analyze_only)    
                return stats;    
    
        /*    
         * If btbulkdelete was called, we need not do anything, just return the    
         * stats from the latest btbulkdelete call.  If it wasn't called, we might    
         * still need to do a pass over the index, to recycle any newly-recyclable    
         * pages or to obtain index statistics.  _bt_vacuum_needs_cleanup    
         * determines if either are needed.    
         *    
         * Since we aren't going to actually delete any leaf items, there's no    
         * need to go through all the vacuum-cycle-ID pushups.    
         */    
        if (stats == NULL)    
        {    
                TransactionId oldestBtpoXact;    
    
                /* Check if we need a cleanup */    
                if (!_bt_vacuum_needs_cleanup(info))  // 不需要scan index    
                        return NULL;    
    
                stats = (IndexBulkDeleteResult *) palloc0(sizeof(IndexBulkDeleteResult));    
                btvacuumscan(info, stats, NULL, NULL, 0, &oldestBtpoXact);  // 需要SCAN index    
    
                /* Update cleanup-related information in the metapage */    
                _bt_update_meta_cleanup_info(info->index, oldestBtpoXact,    
                                                                         info->num_heap_tuples);    
        }    
    
        /*    
         * It's quite possible for us to be fooled by concurrent page splits into    
         * double-counting some index tuples, so disbelieve any total that exceeds    
         * the underlying heap's count ... if we know that accurately.  Otherwise    
         * this might just make matters worse.    
         */    
        if (!info->estimated_count)    
        {    
                if (stats->num_index_tuples > info->num_heap_tuples)    
                        stats->num_index_tuples = info->num_heap_tuples;    
        }    
    
        return stats;    
}    

例子

tbl1,每次vacuum都要scan index

create table tbl1 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 int);    
create index idx_tbl1_1 on tbl1 (c1) with (vacuum_cleanup_index_scale_factor=0);    
create index idx_tbl1_2 on tbl1 (c2) with (vacuum_cleanup_index_scale_factor=0);    
create index idx_tbl1_3 on tbl1 (c3) with (vacuum_cleanup_index_scale_factor=0);    
create index idx_tbl1_4 on tbl1 (c4) with (vacuum_cleanup_index_scale_factor=0);    
create index idx_tbl1_5 on tbl1 (c5) with (vacuum_cleanup_index_scale_factor=0);    
create index idx_tbl1_6 on tbl1 (c6) with (vacuum_cleanup_index_scale_factor=0);    

tbl2,当有deleted page需要recycle使用时,或者当((pg_stat_all_tables.inserted-metapage(上一次vacuum有多少条记录))/上一次vacuum有多少条记录) > 100000000 时,才需要scan index。

create table tbl2 (c1 int, c2 int, c3 int, c4 int, c5 int, c6 int);    
create index idx_tbl2_1 on tbl2 (c1) with (vacuum_cleanup_index_scale_factor=100000000);    
create index idx_tbl2_2 on tbl2 (c2) with (vacuum_cleanup_index_scale_factor=100000000);    
create index idx_tbl2_3 on tbl2 (c3) with (vacuum_cleanup_index_scale_factor=100000000);    
create index idx_tbl2_4 on tbl2 (c4) with (vacuum_cleanup_index_scale_factor=100000000);    
create index idx_tbl2_5 on tbl2 (c5) with (vacuum_cleanup_index_scale_factor=100000000);    
create index idx_tbl2_6 on tbl2 (c6) with (vacuum_cleanup_index_scale_factor=100000000);    

分别写入1000万记录

insert into tbl1 select id,id,id,id,id,id from generate_series(1,10000000) t(id);    
insert into tbl2 select id,id,id,id,id,id from generate_series(1,10000000) t(id);    

观察两个表的二次VACUUM耗时

\timing    
    
vacuum verbose tbl1;    
vacuum verbose tbl1;    
    
vacuum verbose tbl2;    
vacuum verbose tbl2;    
postgres=# vacuum verbose tbl1;    
INFO:  vacuuming "public.tbl1"    
INFO:  index "idx_tbl1_1" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_2" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_3" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_4" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_5" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_6" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  "tbl1": found 0 removable, 10000000 nonremovable row versions in 63695 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.75 s, system: 0.00 s, elapsed: 0.76 s.    
VACUUM    
Time: 771.943 ms    
    
    
    
postgres=# vacuum verbose tbl1;    
INFO:  vacuuming "public.tbl1"    
INFO:  index "idx_tbl1_1" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_2" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_3" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_4" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_5" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_6" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  "tbl1": found 0 removable, 42 nonremovable row versions in 1 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.13 s, system: 0.00 s, elapsed: 0.13 s.    
VACUUM    
Time: 141.759 ms    
    
    
    
postgres=# vacuum verbose tbl1;    
INFO:  vacuuming "public.tbl1"    
INFO:  index "idx_tbl1_1" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_2" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_3" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_4" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_5" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl1_6" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  "tbl1": found 0 removable, 42 nonremovable row versions in 1 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.12 s, system: 0.00 s, elapsed: 0.13 s.    
VACUUM    
Time: 140.984 ms    
    
    
    
postgres=# vacuum verbose tbl2;    
INFO:  vacuuming "public.tbl2"    
INFO:  index "idx_tbl2_1" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl2_2" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl2_3" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl2_4" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl2_5" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  index "idx_tbl2_6" now contains 10000000 row versions in 27421 pages    
DETAIL:  0 index row versions were removed.    
0 index pages have been deleted, 0 are currently reusable.    
CPU: user: 0.02 s, system: 0.00 s, elapsed: 0.02 s.    
INFO:  "tbl2": found 0 removable, 10000000 nonremovable row versions in 63695 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.84 s, system: 0.00 s, elapsed: 0.85 s.    
VACUUM    
Time: 860.749 ms    
    
    
    
postgres=# vacuum verbose tbl2;    
INFO:  vacuuming "public.tbl2"    
INFO:  "tbl2": found 0 removable, 42 nonremovable row versions in 1 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.    
VACUUM    
Time: 11.895 ms    
    
    
    
postgres=# vacuum verbose tbl2;    
INFO:  vacuuming "public.tbl2"    
INFO:  "tbl2": found 0 removable, 42 nonremovable row versions in 1 out of 63695 pages    
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 1137047265    
There were 0 unused item pointers.    
Skipped 0 pages due to buffer pins, 0 frozen pages.    
0 pages are entirely empty.    
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.    
VACUUM    
Time: 11.944 ms    

分析

tbl1,由于每次vacuum都需要scan index,所以更加耗时。

tbl2,由于设置了vacuum_cleanup_index_scale_factor=100000000,下一次vacuum时insert减去上一次的总记录数再除以上一次总记录数,小于100000000,所以不需要scan index。

耗时来看,相当明显。

通过pageinspect,可以观察到btree索引metapage上的table tuples总数。

postgres=# create extension pageinspect;  
CREATE EXTENSION  
  
postgres=# select * from bt_metap('idx_tbl1_1');  
 magic  | version | root | level | fastroot | fastlevel | oldest_xact | last_cleanup_num_tuples   
--------+---------+------+-------+----------+-----------+-------------+-------------------------  
 340322 |       3 |  412 |     2 |      412 |         2 |           0 |             9.99977e+06  
(1 row)  
  
Time: 0.345 ms  
  
postgres=# select * from bt_metap('idx_tbl2_1');  
 magic  | version | root | level | fastroot | fastlevel | oldest_xact | last_cleanup_num_tuples   
--------+---------+------+-------+----------+-----------+-------------+-------------------------  
 340322 |       3 |  412 |     2 |      412 |         2 |           0 |                   1e+07  
(1 row)  
  
Time: 0.429 ms  

参考

src/backend/access/nbtree/nbtree.c

https://www.postgresql.org/docs/11/sql-createindex.html

https://www.postgresql.org/docs/11/runtime-config-client.html#GUC-VACUUM-CLEANUP-INDEX-SCALE-FACTOR

《PostgreSQL pg_stat_ pg_statio_ 统计信息(scan,read,fetch,hit)源码解读》

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。

分享:
阿里云数据库
使用钉钉扫一扫加入圈子
+ 订阅

帮用户承担一切数据库风险,给您何止是安心!

官方博客
链接