PostgreSQL Heap Only Tuple - HOT (降低UPDATE引入的索引写IO放大)

本文涉及的产品
RDS SQL Server Serverless,2-4RCU 50GB 3个月
推荐场景:
云数据库 RDS SQL Server,基础系列 2核4GB
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
简介: 标签PostgreSQL , Heap Only Tuple , HOT背景PostgreSQL目前默认的存储引擎在更新记录时,会在堆内产生一条新版本,旧版本在不需要使用后VACUUM回收,回收旧版本前,需要先回收所有关联这个版本的所有索引POINT。

标签

PostgreSQL , Heap Only Tuple , HOT


背景

PostgreSQL目前默认的存储引擎在更新记录时,会在堆内产生一条新版本,旧版本在不需要使用后VACUUM回收,回收旧版本前,需要先回收所有关联这个版本的所有索引POINT。

PG的索引的KEY为索引字段或表达式的值,VALUE为行号。

8.3以前,每个TUPLE版本(行号)都有对应的索引POINT,因此更新的放大比较大。

8.3开始,引入了HOT的概念,当更新记录时,如果能满足两个条件时,通过HEAP PAGE内部LINK来串起所有TUPLE版本,因此索引不变。

HOT必须满足如下两个条件:

Necessary Condition A: UPDATE does not change any of the index keys    
  
Necessary Condition B: The new version should fit in the same old block – HOT chains can not cross block boundary.   
  
1、索引字段的值不变。(其中任意一个索引字段的值发生了变化,则所有索引都需要新增版本)  
  
2、新的版本与旧的版本在同一个HEAP PAGE中。  

10以后,可以使用二级索引来解决更新引入的索引放大问题:

《PostgreSQL 10.0 preview 性能增强 - 间接索引(secondary index)》

HOT 实例解说

1、创建测试表,写入10条测试数据

postgres=# create table a(id int, c1 int, c2 int, c3 int);  
CREATE TABLE  
postgres=# insert into a select generate_series(1,10), random()*100, random()*100, random()*100;  
INSERT 0 10  

2、创建索引

postgres=# create index idx_a_1 on a (id);  
CREATE INDEX  
postgres=# create index idx_a_2 on a (c1);  
CREATE INDEX  
postgres=# create index idx_a_3 on a (c2);  
CREATE INDEX  

3、通过pageinspect插件观察索引页内容

postgres=# SELECT * FROM bt_metap('idx_a_1');  
 magic  | version | root | level | fastroot | fastlevel   
--------+---------+------+-------+----------+-----------  
 340322 |       2 |    1 |     0 |        1 |         0  
(1 row)  
  
postgres=# SELECT * FROM bt_metap('idx_a_2');  
 magic  | version | root | level | fastroot | fastlevel   
--------+---------+------+-------+----------+-----------  
 340322 |       2 |    1 |     0 |        1 |         0  
(1 row)  
  
postgres=# SELECT * FROM bt_metap('idx_a_3');  
 magic  | version | root | level | fastroot | fastlevel   
--------+---------+------+-------+----------+-----------  
 340322 |       2 |    1 |     0 |        1 |         0  
(1 row)  
postgres=# SELECT * FROM bt_page_items('idx_a_1',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,1)  |      16 | f     | f    | 01 00 00 00 00 00 00 00  
          2 | (0,2)  |      16 | f     | f    | 02 00 00 00 00 00 00 00  
          3 | (0,3)  |      16 | f     | f    | 03 00 00 00 00 00 00 00  
          4 | (0,4)  |      16 | f     | f    | 04 00 00 00 00 00 00 00  
          5 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          6 | (0,6)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          7 | (0,7)  |      16 | f     | f    | 07 00 00 00 00 00 00 00  
          8 | (0,8)  |      16 | f     | f    | 08 00 00 00 00 00 00 00  
          9 | (0,9)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
         10 | (0,10) |      16 | f     | f    | 0a 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_2',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          2 | (0,7)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          3 | (0,1)  |      16 | f     | f    | 0d 00 00 00 00 00 00 00  
          4 | (0,9)  |      16 | f     | f    | 0e 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 20 00 00 00 00 00 00 00  
          6 | (0,3)  |      16 | f     | f    | 24 00 00 00 00 00 00 00  
          7 | (0,6)  |      16 | f     | f    | 28 00 00 00 00 00 00 00  
          8 | (0,10) |      16 | f     | f    | 2a 00 00 00 00 00 00 00  
          9 | (0,2)  |      16 | f     | f    | 3f 00 00 00 00 00 00 00  
         10 | (0,8)  |      16 | f     | f    | 55 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_3',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,6)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
          2 | (0,5)  |      16 | f     | f    | 12 00 00 00 00 00 00 00  
          3 | (0,8)  |      16 | f     | f    | 13 00 00 00 00 00 00 00  
          4 | (0,1)  |      16 | f     | f    | 14 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 22 00 00 00 00 00 00 00  
          6 | (0,10) |      16 | f     | f    | 2b 00 00 00 00 00 00 00  
          7 | (0,3)  |      16 | f     | f    | 30 00 00 00 00 00 00 00  
          8 | (0,9)  |      16 | f     | f    | 33 00 00 00 00 00 00 00  
          9 | (0,7)  |      16 | f     | f    | 46 00 00 00 00 00 00 00  
         10 | (0,2)  |      16 | f     | f    | 4b 00 00 00 00 00 00 00  
(10 rows)  

4、HOT更新,(更新的字段上没有索引,并且新的版本记录在同一个HEAP PAGE上)

postgres=# update a set c3=c3+1 where id=1;  
UPDATE 1  
postgres=# SELECT * FROM bt_page_items('idx_a_1',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,1)  |      16 | f     | f    | 01 00 00 00 00 00 00 00  
          2 | (0,2)  |      16 | f     | f    | 02 00 00 00 00 00 00 00  
          3 | (0,3)  |      16 | f     | f    | 03 00 00 00 00 00 00 00  
          4 | (0,4)  |      16 | f     | f    | 04 00 00 00 00 00 00 00  
          5 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          6 | (0,6)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          7 | (0,7)  |      16 | f     | f    | 07 00 00 00 00 00 00 00  
          8 | (0,8)  |      16 | f     | f    | 08 00 00 00 00 00 00 00  
          9 | (0,9)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
         10 | (0,10) |      16 | f     | f    | 0a 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_2',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          2 | (0,7)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          3 | (0,1)  |      16 | f     | f    | 0d 00 00 00 00 00 00 00  
          4 | (0,9)  |      16 | f     | f    | 0e 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 20 00 00 00 00 00 00 00  
          6 | (0,3)  |      16 | f     | f    | 24 00 00 00 00 00 00 00  
          7 | (0,6)  |      16 | f     | f    | 28 00 00 00 00 00 00 00  
          8 | (0,10) |      16 | f     | f    | 2a 00 00 00 00 00 00 00  
          9 | (0,2)  |      16 | f     | f    | 3f 00 00 00 00 00 00 00  
         10 | (0,8)  |      16 | f     | f    | 55 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_3',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,6)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
          2 | (0,5)  |      16 | f     | f    | 12 00 00 00 00 00 00 00  
          3 | (0,8)  |      16 | f     | f    | 13 00 00 00 00 00 00 00  
          4 | (0,1)  |      16 | f     | f    | 14 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 22 00 00 00 00 00 00 00  
          6 | (0,10) |      16 | f     | f    | 2b 00 00 00 00 00 00 00  
          7 | (0,3)  |      16 | f     | f    | 30 00 00 00 00 00 00 00  
          8 | (0,9)  |      16 | f     | f    | 33 00 00 00 00 00 00 00  
          9 | (0,7)  |      16 | f     | f    | 46 00 00 00 00 00 00 00  
         10 | (0,2)  |      16 | f     | f    | 4b 00 00 00 00 00 00 00  
(10 rows)  
postgres=# update a set c3=c3 where id=1 returning ctid,*;  
  ctid  | id | c1 | c2 | c3   
--------+----+----+----+----  
 (0,13) |  1 | 13 | 20 | 15  
(1 row)  
  
UPDATE 1  
postgres=# SELECT * FROM bt_page_items('idx_a_1',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,1)  |      16 | f     | f    | 01 00 00 00 00 00 00 00  
          2 | (0,2)  |      16 | f     | f    | 02 00 00 00 00 00 00 00  
          3 | (0,3)  |      16 | f     | f    | 03 00 00 00 00 00 00 00  
          4 | (0,4)  |      16 | f     | f    | 04 00 00 00 00 00 00 00  
          5 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          6 | (0,6)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          7 | (0,7)  |      16 | f     | f    | 07 00 00 00 00 00 00 00  
          8 | (0,8)  |      16 | f     | f    | 08 00 00 00 00 00 00 00  
          9 | (0,9)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
         10 | (0,10) |      16 | f     | f    | 0a 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_2',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          2 | (0,7)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          3 | (0,1)  |      16 | f     | f    | 0d 00 00 00 00 00 00 00  
          4 | (0,9)  |      16 | f     | f    | 0e 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 20 00 00 00 00 00 00 00  
          6 | (0,3)  |      16 | f     | f    | 24 00 00 00 00 00 00 00  
          7 | (0,6)  |      16 | f     | f    | 28 00 00 00 00 00 00 00  
          8 | (0,10) |      16 | f     | f    | 2a 00 00 00 00 00 00 00  
          9 | (0,2)  |      16 | f     | f    | 3f 00 00 00 00 00 00 00  
         10 | (0,8)  |      16 | f     | f    | 55 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_3',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,6)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
          2 | (0,5)  |      16 | f     | f    | 12 00 00 00 00 00 00 00  
          3 | (0,8)  |      16 | f     | f    | 13 00 00 00 00 00 00 00  
          4 | (0,1)  |      16 | f     | f    | 14 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 22 00 00 00 00 00 00 00  
          6 | (0,10) |      16 | f     | f    | 2b 00 00 00 00 00 00 00  
          7 | (0,3)  |      16 | f     | f    | 30 00 00 00 00 00 00 00  
          8 | (0,9)  |      16 | f     | f    | 33 00 00 00 00 00 00 00  
          9 | (0,7)  |      16 | f     | f    | 46 00 00 00 00 00 00 00  
         10 | (0,2)  |      16 | f     | f    | 4b 00 00 00 00 00 00 00  
(10 rows)  
postgres=# update a set c3=c3 where id=1;  
UPDATE 1  
postgres=# SELECT * FROM bt_page_items('idx_a_1',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,1)  |      16 | f     | f    | 01 00 00 00 00 00 00 00  
          2 | (0,2)  |      16 | f     | f    | 02 00 00 00 00 00 00 00  
          3 | (0,3)  |      16 | f     | f    | 03 00 00 00 00 00 00 00  
          4 | (0,4)  |      16 | f     | f    | 04 00 00 00 00 00 00 00  
          5 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          6 | (0,6)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          7 | (0,7)  |      16 | f     | f    | 07 00 00 00 00 00 00 00  
          8 | (0,8)  |      16 | f     | f    | 08 00 00 00 00 00 00 00  
          9 | (0,9)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
         10 | (0,10) |      16 | f     | f    | 0a 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_2',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          2 | (0,7)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          3 | (0,1)  |      16 | f     | f    | 0d 00 00 00 00 00 00 00  
          4 | (0,9)  |      16 | f     | f    | 0e 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 20 00 00 00 00 00 00 00  
          6 | (0,3)  |      16 | f     | f    | 24 00 00 00 00 00 00 00  
          7 | (0,6)  |      16 | f     | f    | 28 00 00 00 00 00 00 00  
          8 | (0,10) |      16 | f     | f    | 2a 00 00 00 00 00 00 00  
          9 | (0,2)  |      16 | f     | f    | 3f 00 00 00 00 00 00 00  
         10 | (0,8)  |      16 | f     | f    | 55 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_3',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,6)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
          2 | (0,5)  |      16 | f     | f    | 12 00 00 00 00 00 00 00  
          3 | (0,8)  |      16 | f     | f    | 13 00 00 00 00 00 00 00  
          4 | (0,1)  |      16 | f     | f    | 14 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 22 00 00 00 00 00 00 00  
          6 | (0,10) |      16 | f     | f    | 2b 00 00 00 00 00 00 00  
          7 | (0,3)  |      16 | f     | f    | 30 00 00 00 00 00 00 00  
          8 | (0,9)  |      16 | f     | f    | 33 00 00 00 00 00 00 00  
          9 | (0,7)  |      16 | f     | f    | 46 00 00 00 00 00 00 00  
         10 | (0,2)  |      16 | f     | f    | 4b 00 00 00 00 00 00 00  
(10 rows)  

5、HOT更新,有索引的字段被更新,但是值不变。并且新版本在同一个PAGE里。

postgres=# update a set c2=c2 where id=1 returning ctid,*;  
  ctid  | id | c1 | c2 | c3   
--------+----+----+----+----  
 (0,14) |  1 | 13 | 20 | 15  
(1 row)  
  
UPDATE 1  
postgres=# SELECT * FROM bt_page_items('idx_a_1',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,1)  |      16 | f     | f    | 01 00 00 00 00 00 00 00  
          2 | (0,2)  |      16 | f     | f    | 02 00 00 00 00 00 00 00  
          3 | (0,3)  |      16 | f     | f    | 03 00 00 00 00 00 00 00  
          4 | (0,4)  |      16 | f     | f    | 04 00 00 00 00 00 00 00  
          5 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          6 | (0,6)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          7 | (0,7)  |      16 | f     | f    | 07 00 00 00 00 00 00 00  
          8 | (0,8)  |      16 | f     | f    | 08 00 00 00 00 00 00 00  
          9 | (0,9)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
         10 | (0,10) |      16 | f     | f    | 0a 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_2',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          2 | (0,7)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          3 | (0,1)  |      16 | f     | f    | 0d 00 00 00 00 00 00 00  
          4 | (0,9)  |      16 | f     | f    | 0e 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 20 00 00 00 00 00 00 00  
          6 | (0,3)  |      16 | f     | f    | 24 00 00 00 00 00 00 00  
          7 | (0,6)  |      16 | f     | f    | 28 00 00 00 00 00 00 00  
          8 | (0,10) |      16 | f     | f    | 2a 00 00 00 00 00 00 00  
          9 | (0,2)  |      16 | f     | f    | 3f 00 00 00 00 00 00 00  
         10 | (0,8)  |      16 | f     | f    | 55 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_3',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,6)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
          2 | (0,5)  |      16 | f     | f    | 12 00 00 00 00 00 00 00  
          3 | (0,8)  |      16 | f     | f    | 13 00 00 00 00 00 00 00  
          4 | (0,1)  |      16 | f     | f    | 14 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 22 00 00 00 00 00 00 00  
          6 | (0,10) |      16 | f     | f    | 2b 00 00 00 00 00 00 00  
          7 | (0,3)  |      16 | f     | f    | 30 00 00 00 00 00 00 00  
          8 | (0,9)  |      16 | f     | f    | 33 00 00 00 00 00 00 00  
          9 | (0,7)  |      16 | f     | f    | 46 00 00 00 00 00 00 00  
         10 | (0,2)  |      16 | f     | f    | 4b 00 00 00 00 00 00 00  
(10 rows)  

以上HOT更新,所有索引都没有发生变化。

6、NON-HOT更新,更新了索引字段的值,所有索引都发生了变化,至少发生了3个索引IO。

postgres=# update a set c2=c2+1 where id=1 returning ctid,*;  
  ctid  | id | c1 | c2 | c3   
--------+----+----+----+----  
 (0,15) |  1 | 13 | 21 | 15  
(1 row)  
  
UPDATE 1  
postgres=# SELECT * FROM bt_page_items('idx_a_1',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,15) |      16 | f     | f    | 01 00 00 00 00 00 00 00  
          2 | (0,1)  |      16 | f     | f    | 01 00 00 00 00 00 00 00  
          3 | (0,2)  |      16 | f     | f    | 02 00 00 00 00 00 00 00  
          4 | (0,3)  |      16 | f     | f    | 03 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 04 00 00 00 00 00 00 00  
          6 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          7 | (0,6)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          8 | (0,7)  |      16 | f     | f    | 07 00 00 00 00 00 00 00  
          9 | (0,8)  |      16 | f     | f    | 08 00 00 00 00 00 00 00  
         10 | (0,9)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
         11 | (0,10) |      16 | f     | f    | 0a 00 00 00 00 00 00 00  
(11 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_2',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          2 | (0,7)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          3 | (0,15) |      16 | f     | f    | 0d 00 00 00 00 00 00 00  
          4 | (0,1)  |      16 | f     | f    | 0d 00 00 00 00 00 00 00  
          5 | (0,9)  |      16 | f     | f    | 0e 00 00 00 00 00 00 00  
          6 | (0,4)  |      16 | f     | f    | 20 00 00 00 00 00 00 00  
          7 | (0,3)  |      16 | f     | f    | 24 00 00 00 00 00 00 00  
          8 | (0,6)  |      16 | f     | f    | 28 00 00 00 00 00 00 00  
          9 | (0,10) |      16 | f     | f    | 2a 00 00 00 00 00 00 00  
         10 | (0,2)  |      16 | f     | f    | 3f 00 00 00 00 00 00 00  
         11 | (0,8)  |      16 | f     | f    | 55 00 00 00 00 00 00 00  
(11 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_3',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,6)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
          2 | (0,5)  |      16 | f     | f    | 12 00 00 00 00 00 00 00  
          3 | (0,8)  |      16 | f     | f    | 13 00 00 00 00 00 00 00  
          4 | (0,1)  |      16 | f     | f    | 14 00 00 00 00 00 00 00  
          5 | (0,15) |      16 | f     | f    | 15 00 00 00 00 00 00 00  
          6 | (0,4)  |      16 | f     | f    | 22 00 00 00 00 00 00 00  
          7 | (0,10) |      16 | f     | f    | 2b 00 00 00 00 00 00 00  
          8 | (0,3)  |      16 | f     | f    | 30 00 00 00 00 00 00 00  
          9 | (0,9)  |      16 | f     | f    | 33 00 00 00 00 00 00 00  
         10 | (0,7)  |      16 | f     | f    | 46 00 00 00 00 00 00 00  
         11 | (0,2)  |      16 | f     | f    | 4b 00 00 00 00 00 00 00  
(11 rows)  

7、垃圾回收,首先回收索引垃圾版本,最后回收表的垃圾版本。

postgres=# vacuum verbose a;  
INFO:  vacuuming "public.a"  
INFO:  scanned index "idx_a_1" to remove 1 row versions  
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s  
INFO:  scanned index "idx_a_2" to remove 1 row versions  
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s  
INFO:  scanned index "idx_a_3" to remove 1 row versions  
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s  
INFO:  "a": removed 1 row versions in 1 pages  
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s  
INFO:  index "idx_a_1" now contains 10 row versions in 2 pages  
DETAIL:  1 index row versions were removed.  
0 index pages have been deleted, 0 are currently reusable.  
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.  
INFO:  index "idx_a_2" now contains 10 row versions in 2 pages  
DETAIL:  1 index row versions were removed.  
0 index pages have been deleted, 0 are currently reusable.  
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.  
INFO:  index "idx_a_3" now contains 10 row versions in 2 pages  
DETAIL:  1 index row versions were removed.  
0 index pages have been deleted, 0 are currently reusable.  
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.  
INFO:  "a": found 5 removable, 10 nonremovable row versions in 1 out of 1 pages  
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 716311280  
There were 4 unused item pointers.  
Skipped 0 pages due to buffer pins, 0 frozen pages.  
0 pages are entirely empty.  
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.  
VACUUM  

8、垃圾回收后,索引的垃圾版本被清除。

postgres=# SELECT * FROM bt_page_items('idx_a_1',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,15) |      16 | f     | f    | 01 00 00 00 00 00 00 00  
          2 | (0,2)  |      16 | f     | f    | 02 00 00 00 00 00 00 00  
          3 | (0,3)  |      16 | f     | f    | 03 00 00 00 00 00 00 00  
          4 | (0,4)  |      16 | f     | f    | 04 00 00 00 00 00 00 00  
          5 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          6 | (0,6)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          7 | (0,7)  |      16 | f     | f    | 07 00 00 00 00 00 00 00  
          8 | (0,8)  |      16 | f     | f    | 08 00 00 00 00 00 00 00  
          9 | (0,9)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
         10 | (0,10) |      16 | f     | f    | 0a 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_2',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,5)  |      16 | f     | f    | 05 00 00 00 00 00 00 00  
          2 | (0,7)  |      16 | f     | f    | 06 00 00 00 00 00 00 00  
          3 | (0,15) |      16 | f     | f    | 0d 00 00 00 00 00 00 00  
          4 | (0,9)  |      16 | f     | f    | 0e 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 20 00 00 00 00 00 00 00  
          6 | (0,3)  |      16 | f     | f    | 24 00 00 00 00 00 00 00  
          7 | (0,6)  |      16 | f     | f    | 28 00 00 00 00 00 00 00  
          8 | (0,10) |      16 | f     | f    | 2a 00 00 00 00 00 00 00  
          9 | (0,2)  |      16 | f     | f    | 3f 00 00 00 00 00 00 00  
         10 | (0,8)  |      16 | f     | f    | 55 00 00 00 00 00 00 00  
(10 rows)  
  
postgres=# SELECT * FROM bt_page_items('idx_a_3',1);  
 itemoffset |  ctid  | itemlen | nulls | vars |          data             
------------+--------+---------+-------+------+-------------------------  
          1 | (0,6)  |      16 | f     | f    | 09 00 00 00 00 00 00 00  
          2 | (0,5)  |      16 | f     | f    | 12 00 00 00 00 00 00 00  
          3 | (0,8)  |      16 | f     | f    | 13 00 00 00 00 00 00 00  
          4 | (0,15) |      16 | f     | f    | 15 00 00 00 00 00 00 00  
          5 | (0,4)  |      16 | f     | f    | 22 00 00 00 00 00 00 00  
          6 | (0,10) |      16 | f     | f    | 2b 00 00 00 00 00 00 00  
          7 | (0,3)  |      16 | f     | f    | 30 00 00 00 00 00 00 00  
          8 | (0,9)  |      16 | f     | f    | 33 00 00 00 00 00 00 00  
          9 | (0,7)  |      16 | f     | f    | 46 00 00 00 00 00 00 00  
         10 | (0,2)  |      16 | f     | f    | 4b 00 00 00 00 00 00 00  
(10 rows)  

性能测试

1、创建测试表32个索引。

do language plpgsql $$  
declare  
  sql text;  
begin  
  sql := 'create table a (id int primary key,';  
  for i in 1..32 loop  
    sql := sql||'c'||i||' int default random()*1000,';  
  end loop;  
  sql := rtrim(sql,',');  
  sql := sql||') with (fillfactor=80)';   
  execute sql;  
  for i in 2..32 loop  
    execute 'create index idx_a_c'||i||' on a (c'||i||')';  
  end loop;  
end;  
$$;  

2、写入1000万记录

insert into a (id) select generate_series(1,10000000);  

3、non-hot更新

vi test_non_hot.sql  
\set id random(1,10000000)  
update a set c2=c2+random()*100-100 where id=:id;  

4、HOT更新

vi test_hot1.sql  
\set id random(1,10000000)  
update a set c1=c1+random()*100-100 where id=:id;  
  
vi test_hot2.sql  
\set id random(1,10000000)  
update a set c2=c2 where id=:id;  

5、性能对比

5.1、HOT

pgbench -M prepared -n -r -P 1 -f ./test_hot1.sql -c 28 -j 28 -T 120  
  
transaction type: ./test_hot1.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 28  
number of threads: 28  
duration: 120 s  
number of transactions actually processed: 9139010  
latency average = 0.368 ms  
latency stddev = 0.187 ms  
tps = 76157.798606 (including connections establishing)  
tps = 76174.469712 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.001  \set id random(1,10000000)  
         0.366  update a set c1=c1+random()*100-100 where id=:id;  
  
Total DISK READ :       0.00 B/s | Total DISK WRITE :      13.14 M/s  
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:      13.82 M/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
45828 be/4 postgres    0.00 B/s   23.39 K/s  0.00 %  1.06 % postgres: postgres postgres 127.0.0.1(41326) UPDATE  
45810 be/4 postgres    0.00 B/s   23.39 K/s  0.00 %  0.99 % postgres: postgres postgres 127.0.0.1(41290) UPDATE  
45821 be/4 postgres    0.00 B/s   23.39 K/s  0.00 %  0.98 % postgres: postgres postgres 127.0.0.1(41312) idle    
45820 be/4 postgres    0.00 B/s    7.80 K/s  0.00 %  0.96 % postgres: postgres postgres 127.0.0.1(41310) UPDATE  
45822 be/4 postgres    0.00 B/s    0.00 B/s  0.00 %  0.96 % postgres: postgres postgres 127.0.0.1(41314) UPDATE  
45819 be/4 postgres    0.00 B/s    7.80 K/s  0.00 %  0.94 % postgres: postgres postgres 127.0.0.1(41308) UPDATE  
45806 be/4 postgres    0.00 B/s    7.80 K/s  0.00 %  0.92 % postgres: postgres postgres 127.0.0.1(41282) idle    
45824 be/4 postgres    0.00 B/s    7.80 K/s  0.00 %  0.90 % postgres: postgres postgres 127.0.0.1(41318) UPDATE  
45827 be/4 postgres    0.00 B/s   54.58 K/s  0.00 %  0.89 % postgres: postgres postgres 127.0.0.1(41324) UPDATE  
45814 be/4 postgres    0.00 B/s    0.00 B/s  0.00 %  0.89 % postgres: postgres postgres 127.0.0.1(41298) UPDATE  
45818 be/4 postgres    0.00 B/s    7.80 K/s  0.00 %  0.88 % postgres: postgres postgres 127.0.0.1(41306) idle    
45823 be/4 postgres    0.00 B/s   15.59 K/s  0.00 %  0.87 % postgres: postgres postgres 127.0.0.1(41316) UPDATE  
45805 be/4 postgres    0.00 B/s    0.00 B/s  0.00 %  0.87 % postgres: postgres postgres 127.0.0.1(41280) UPDATE  
45826 be/4 postgres    0.00 B/s    7.80 K/s  0.00 %  0.86 % postgres: postgres postgres 127.0.0.1(41322) UPDATE  
45809 be/4 postgres    0.00 B/s    0.00 B/s  0.00 %  0.86 % postgres: postgres postgres 127.0.0.1(41288) UPDATE  
45808 be/4 postgres    0.00 B/s   23.39 K/s  0.00 %  0.85 % postgres: postgres postgres 127.0.0.1(41286) UPDATE  
45825 be/4 postgres    0.00 B/s    0.00 B/s  0.00 %  0.85 % postgres: postgres postgres 127.0.0.1(41320) UPDATE  
45804 be/4 postgres    0.00 B/s   23.39 K/s  0.00 %  0.84 % postgres: postgres postgres 127.0.0.1(41278) UPDATE  
49040 be/4 postgres    0.00 B/s   12.85 M/s  0.00 %  0.84 % postgres: wal writer process  
45816 be/4 postgres    0.00 B/s    7.80 K/s  0.00 %  0.66 % postgres: postgres postgres 127.0.0.1(41302) idle    
45829 be/4 postgres    0.00 B/s    0.00 B/s  0.00 %  0.64 % postgres: postgres postgres 127.0.0.1(41328) UPDATE  
45803 be/4 postgres    0.00 B/s    0.00 B/s  0.00 %  0.62 % postgres: postgres postgres 127.0.0.1(41276) BIND    
45795 be/4 postgres    0.00 B/s    0.00 B/s  0.00 %  0.59 % postgres: postgres postgres 127.0.0.1(41274) UPDATE  
45807 be/4 postgres    0.00 B/s    7.80 K/s  0.00 %  0.58 % postgres: postgres postgres 127.0.0.1(41284) UPDATE  
45812 be/4 postgres    0.00 B/s   15.59 K/s  0.00 %  0.56 % postgres: postgres postgres 127.0.0.1(41294) UPDATE  
45811 be/4 postgres    0.00 B/s   15.59 K/s  0.00 %  0.55 % postgres: postgres postgres 127.0.0.1(41292) UPDATE  
45817 be/4 postgres    0.00 B/s    7.80 K/s  0.00 %  0.55 % postgres: postgres postgres 127.0.0.1(41304) UPDATE  
45815 be/4 postgres    0.00 B/s    7.80 K/s  0.00 %  0.55 % postgres: postgres postgres 127.0.0.1(41300) UPDATE  
45813 be/4 postgres    0.00 B/s    0.00 B/s  0.00 %  0.51 % postgres: postgres postgres 127.0.0.1(41296) UPDATE  

5.2、NON-HOT

pgbench -M prepared -n -r -P 1 -f ./test_non_hot.sql -c 28 -j 28 -T 120  
  
transaction type: ./test_non_hot.sql  
scaling factor: 1  
query mode: prepared  
number of clients: 28  
number of threads: 28  
duration: 120 s  
number of transactions actually processed: 6472445  
latency average = 0.519 ms  
latency stddev = 0.707 ms  
tps = 53922.273197 (including connections establishing)  
tps = 53933.908671 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set id random(1,10000000)  
         0.517  update a set c2=c2+random()*100-100 where id=:id;  
  
Total DISK READ :       0.00 B/s | Total DISK WRITE :     191.66 M/s  
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:     142.11 M/s  
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                         
49040 be/4 postgres    0.00 B/s  136.56 M/s  0.00 %  7.17 % postgres: wal writer process  
45997 be/4 postgres    0.00 B/s 1918.73 K/s  0.00 %  1.60 % postgres: postgres postgres 127.0.0.1(41384) BIND    
45983 be/4 postgres    0.00 B/s 1903.25 K/s  0.00 %  1.56 % postgres: postgres postgres 127.0.0.1(41356) UPDATE  
45977 be/4 postgres    0.00 B/s 1829.75 K/s  0.00 %  1.54 % postgres: postgres postgres 127.0.0.1(41344) UPDATE  
45984 be/4 postgres    0.00 B/s 1918.73 K/s  0.00 %  1.53 % postgres: postgres postgres 127.0.0.1(41358) UPDATE  
45985 be/4 postgres    0.00 B/s 1616.99 K/s  0.00 %  1.50 % postgres: postgres postgres 127.0.0.1(41360) UPDATE  
45986 be/4 postgres    0.00 B/s 1748.52 K/s  0.00 %  1.49 % postgres: postgres postgres 127.0.0.1(41362) UPDATE  
45995 be/4 postgres    0.00 B/s 1616.99 K/s  0.00 %  1.47 % postgres: postgres postgres 127.0.0.1(41380) UPDATE  
45988 be/4 postgres    0.00 B/s 1910.99 K/s  0.00 %  1.46 % postgres: postgres postgres 127.0.0.1(41366) UPDATE  
45979 be/4 postgres    0.00 B/s 1763.99 K/s  0.00 %  1.46 % postgres: postgres postgres 127.0.0.1(41348) UPDATE  
45976 be/4 postgres    0.00 B/s 1918.73 K/s  0.00 %  1.45 % postgres: postgres postgres 127.0.0.1(41342) UPDATE  
45982 be/4 postgres    0.00 B/s 1887.78 K/s  0.00 %  1.43 % postgres: postgres postgres 127.0.0.1(41354) UPDATE  
45987 be/4 postgres    0.00 B/s 2019.31 K/s  0.00 %  1.42 % postgres: postgres postgres 127.0.0.1(41364) UPDATE  
45992 be/4 postgres    0.00 B/s 1616.99 K/s  0.00 %  1.40 % postgres: postgres postgres 127.0.0.1(41374) UPDATE  
45994 be/4 postgres    0.00 B/s 1941.94 K/s  0.00 %  1.38 % postgres: postgres postgres 127.0.0.1(41378) UPDATE  
45990 be/4 postgres    0.00 B/s 1794.94 K/s  0.00 %  1.36 % postgres: postgres postgres 127.0.0.1(41370) UPDATE  
45975 be/4 postgres    0.00 B/s 1934.20 K/s  0.00 %  1.35 % postgres: postgres postgres 127.0.0.1(41340) UPDATE  
45974 be/4 postgres    0.00 B/s 1910.99 K/s  0.00 %  1.35 % postgres: postgres postgres 127.0.0.1(41338) UPDATE  
45980 be/4 postgres    0.00 B/s 2003.83 K/s  0.00 %  1.28 % postgres: postgres postgres 127.0.0.1(41350) UPDATE  
45991 be/4 postgres    0.00 B/s 1748.52 K/s  0.00 %  1.13 % postgres: postgres postgres 127.0.0.1(41372) UPDATE  
45996 be/4 postgres    0.00 B/s 2003.83 K/s  0.00 %  1.04 % postgres: postgres postgres 127.0.0.1(41382) UPDATE  
45993 be/4 postgres    0.00 B/s 1918.73 K/s  0.00 %  1.02 % postgres: postgres postgres 127.0.0.1(41376) UPDATE  
45972 be/4 postgres    0.00 B/s 1903.25 K/s  0.00 %  1.02 % postgres: postgres postgres 127.0.0.1(41334) UPDATE  
45978 be/4 postgres    0.00 B/s 1740.78 K/s  0.00 %  1.02 % postgres: postgres postgres 127.0.0.1(41346) UPDATE  
45998 be/4 postgres    0.00 B/s 1841.36 K/s  0.00 %  1.02 % postgres: postgres postgres 127.0.0.1(41386) UPDATE  
45981 be/4 postgres    0.00 B/s 1818.15 K/s  0.00 %  1.01 % postgres: postgres postgres 127.0.0.1(41352) UPDATE  
45989 be/4 postgres    0.00 B/s 1895.52 K/s  0.00 %  1.01 % postgres: postgres postgres 127.0.0.1(41368) UPDATE  
45973 be/4 postgres    0.00 B/s 1941.94 K/s  0.00 %  0.99 % postgres: postgres postgres 127.0.0.1(41336) idle    
45961 be/4 postgres    0.00 B/s 1872.31 K/s  0.00 %  0.96 % postgres: postgres postgres 127.0.0.1(41332) UPDATE  
49039 be/4 postgres    0.00 B/s    4.37 M/s  0.00 %  0.00 % postgres: writer process  
49036 be/4 postgres    0.00 B/s    3.87 K/s  0.00 %  0.00 % postgres: logger process           

使用HOT技术,使得TPS从 53922 提升到了 76157 。IO资源消耗从 192MB/s 降低到了 14MB/s 。

小结

HOT、二级索引、zheap存储引擎,都可以解决更新引入的索引放大问题。

本文介绍了HOT,HOT必须满足如下两个条件:

Necessary Condition A: UPDATE does not change any of the index keys    
  
Necessary Condition B: The new version should fit in the same old block – HOT chains can not cross block boundary.   
  
1、索引字段的值不变。(其中任意一个索引字段的值发生了变化,则所有索引都需要新增版本)  
  
2、新的版本与旧的版本在同一个HEAP PAGE中。  

参考

《PostgreSQL 11 preview - Surjective indexes - 索引HOT增强(表达式)update评估》

HOT Inside - The Technical Architecture

《PostgreSQL 10.0 preview 性能增强 - 间接索引(secondary index)》

src/backend/access/heap/README.HOT

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
目录
相关文章
|
SQL 监控 关系型数据库
【巡检问题分析与最佳实践】RDS PostgreSQL 实例IO高问题
实例的磁盘IO负载是RDS PostgreSQL用户日常应重点关注的监控项之一,如果磁盘IO压力过大,很容易导致数据库性能问题。
【巡检问题分析与最佳实践】RDS PostgreSQL 实例IO高问题
|
7月前
|
SQL 关系型数据库 MySQL
PostgreSQL【异常 01】java.io.IOException:Tried to send an out-of-range integer as a 2-byte value 分析+解决
PostgreSQL【异常 01】java.io.IOException:Tried to send an out-of-range integer as a 2-byte value 分析+解决
515 1
|
存储 算法 大数据
倚天性能优化--基于倚天优化后的zstd在大数据场景应用:降低存储成本+提升重IO场景性能
倚天性能优化--基于倚天优化后的zstd在大数据场景应用:降低存储成本+提升重IO场景性能
|
缓存 监控 关系型数据库
[译]PostgreSQL16-新特性-新增IO统计视图:pg_stat_io
[译]PostgreSQL16-新特性-新增IO统计视图:pg_stat_io
244 0
|
关系型数据库 PostgreSQL 索引
PostgreSQL通过索引获取heap tuple解析
PostgreSQL通过索引获取heap tuple解析
183 0
|
SQL JSON 关系型数据库
扩展我们的分析处理服务(Smartly.io):使用 Citus 对 PostgreSQL 数据库进行分片
扩展我们的分析处理服务(Smartly.io):使用 Citus 对 PostgreSQL 数据库进行分片
329 0
扩展我们的分析处理服务(Smartly.io):使用 Citus 对 PostgreSQL 数据库进行分片
|
存储 传感器 关系型数据库
【重新发现PostgreSQL之美】- 8 轨迹业务IO杀手克星index include(覆盖索引)
大家好 ,这里是重新发现PostgreSQL之美 - 8 轨迹业务IO杀手克星index include(覆盖索引)
|
存储 弹性计算 关系型数据库
PostgreSQL 11 内核优化 - 降低vacuum cleanup阶段index scan概率 ( vacuum_cleanup_index_scale_factor , skip index vacuum cleanup stage)
标签 PostgreSQL , vacuum_cleanup_index_scale_factor , 索引vacuum 背景 在执行vacuum时,有一个cleanup阶段,以往,不管这个阶段是否需要清理PAGE,只要表上面有索引,就需要对这个表的索引全部扫描一遍。 今天,PG 11版本,增加了一个GUC参数vacuum_cleanup_index_scale_factor,以及bt
451 0
|
存储 关系型数据库 Go
PostgreSQL 11 内核优化 - 降低vacuum cleanup阶段index scan概率 ( vacuum_cleanup_index_scale_factor , skip index vacuum cleanup stage)
PostgreSQL 11 内核优化 - 降低vacuum cleanup阶段index scan概率 ( vacuum_cleanup_index_scale_factor , skip index vacuum cleanup stage)
1273 0
|
4月前
|
存储 Java
【IO面试题 四】、介绍一下Java的序列化与反序列化
Java的序列化与反序列化允许对象通过实现Serializable接口转换成字节序列并存储或传输,之后可以通过ObjectInputStream和ObjectOutputStream的方法将这些字节序列恢复成对象。

相关产品

  • 云原生数据库 PolarDB
  • 云数据库 RDS PostgreSQL 版