PostgreSQL 为什么不要滥用unlogged table & hash index

本文涉及的产品
云原生数据库 PolarDB PostgreSQL 版,标准版 2核4GB 50GB
云原生数据库 PolarDB MySQL 版,通用型 2核4GB 50GB
简介:
unlogged table和hash index同样都不会写XLOG,所以如果你用流复制来搞HA,一定概要搞清楚一个问题,切换到备库的话unlogged table数据会被清掉,而hash index也没有,走hash index会失败。

unlogged table 的风险以及修复手段可以见 :
http://blog.163.com/digoal@126/blog/static/163877040201582621345351/

hash index则风险略小,但是也必须重建,但是个人还是建议大家不要使用hash index,改用btree,因为性能确实相差无几。

批量将数据库集群的hash index修改为btree index的方法:
例子:
先并行创建一个btree索引,然后并行删除对应的hash 索引。
$ vi test.sh
#!/bin/bash

for db in `psql -n -q -t -h 127.0.0.1 -p 1921 -U postgres postgres -c "copy (select datname from pg_database where datname <>'template0') to stdout;"`
do 

psql -n -q -t -h 127.0.0.1 -p 1921 -U postgres $db -c "with t1(sql,nsp,idx) as (select regexp_replace(indexdef,'USING hash','USING btree'),schemaname,indexname from pg_indexes where indexdef ~ 'USING hash'), t2(sql_create,sql_drop) as (select regexp_replace(sql,'CREATE INDEX','CREATE INDEX CONCURRENTLY'), 'DROP INDEX CONCURRENTLY '||quote_ident(nsp)||'.'||quote_ident(idx) from t1) select regexp_replace(sql_create,'CONCURRENTLY (.*) ON','CONCURRENTLY \1_0926 ON')  ||'; '|| sql_drop ||'; ' from t2;"|psql -a -e -h 127.0.0.1 -p 1921 -U postgres $db -f -

done

$ . ./test.sh

这个with查询的结果如下举例:
postgres=# with t1(sql,nsp,idx) as (select regexp_replace(indexdef,'USING hash','USING btree'),schemaname,indexname from pg_indexes where indexdef ~ 'USING hash'),
t2(sql_create,sql_drop) as (select regexp_replace(sql,'CREATE INDEX','CREATE INDEX CONCURRENTLY'), 'DROP INDEX CONCURRENTLY '||quote_ident(nsp)||'.'||quote_ident(idx) from t1)
select regexp_replace(sql_create,'CONCURRENTLY (.*) ON','CONCURRENTLY \1_0926 ON')  ||'; '|| sql_drop ||'; ' from t2;
                                            ?column?                                             
-------------------------------------------------------------------------------------------------
 CREATE INDEX CONCURRENTLY hi1_0926 ON t USING btree (id); DROP INDEX CONCURRENTLY public.hi1; 
 CREATE INDEX CONCURRENTLY hi2_0926 ON s1.tbl USING btree (id); DROP INDEX CONCURRENTLY s1.hi2; 
(2 rows)

或者在每个数据库调用这个inline code:
do language plpgsql 
$$

declare
  v_sql text; 
  v_schema name; 
  v_idx name; 
  sql1 text;
begin
for v_sql,v_schema,v_idx in select regexp_replace(indexdef,'USING hash','USING btree'),schemaname,indexname from pg_indexes where indexdef ~ 'USING hash'
loop
  sql1='DROP INDEX '||quote_ident(v_schema)||'.'||quote_ident(v_idx); 
  execute sql1; 
  execute v_sql; 
end loop; 
end; 

$$
; 

postgres=# \d t
       Table "public.t"
 Column |  Type   | Modifiers 
--------+---------+-----------
 id     | integer | not null
Indexes:
    "t_pkey" PRIMARY KEY, btree (id)
    "i1" btree (id)
    "i2" btree (id)

postgres=# \d s1.tbl
        Table "s1.tbl"
 Column |  Type   | Modifiers 
--------+---------+-----------
 id     | integer | 
Indexes:
    "i1" btree (id)

关于hash和btree性能:
查询性能:
postgres=# create table tbl(id int, info text);
CREATE TABLE
postgres=# insert into tbl select generate_series(1,1000000);
INSERT 0 1000000
postgres=# create index idx_tbl1 on tbl using hash (id);

$ vi test.sql
\setrandom id 1 1000000
select * from tbl where id=:id;

postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 30
progress: 1.0 s, 24219.3 tps, lat 0.258 ms stddev 0.436
progress: 2.0 s, 29387.1 tps, lat 0.270 ms stddev 0.401
progress: 3.0 s, 29281.0 tps, lat 0.271 ms stddev 0.442
progress: 4.0 s, 29231.7 tps, lat 0.272 ms stddev 0.844
......
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 30 s
number of transactions actually processed: 876806
latency average: 0.270 ms
latency stddev: 0.592 ms
tps = 29202.503991 (including connections establishing)
tps = 29379.956438 (excluding connections establishing)
statement latencies in milliseconds:
        0.003062        \setrandom id 1 1000000
        0.266481        select * from tbl where id=:id;

postgres=# drop index idx_tbl1 ;
DROP INDEX
postgres=# create index idx_tbl1 on tbl using btree (id);
CREATE INDEX

postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 30
progress: 1.0 s, 28414.2 tps, lat 0.240 ms stddev 0.306
progress: 2.0 s, 31192.2 tps, lat 0.255 ms stddev 0.605
progress: 3.0 s, 31022.8 tps, lat 0.256 ms stddev 0.451
progress: 4.0 s, 29587.1 tps, lat 0.268 ms stddev 0.671
......
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 30 s
number of transactions actually processed: 903467
latency average: 0.263 ms
latency stddev: 0.678 ms
tps = 30088.054150 (including connections establishing)
tps = 30229.295069 (excluding connections establishing)
statement latencies in milliseconds:
        0.002900        \setrandom id 1 1000000
        0.259402        select * from tbl where id=:id;

更新性能
$ vi test.sql
\setrandom id 1 1000000
update tbl set id=1+:id where id=:id;

postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 30
progress: 1.0 s, 12500.2 tps, lat 0.570 ms stddev 0.864
progress: 2.0 s, 17456.9 tps, lat 0.456 ms stddev 0.641
progress: 3.0 s, 18242.3 tps, lat 0.435 ms stddev 0.234
progress: 4.0 s, 17693.0 tps, lat 0.450 ms stddev 0.909
progress: 5.0 s, 17753.3 tps, lat 0.448 ms stddev 0.758
......
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 30 s
number of transactions actually processed: 521331
latency average: 0.456 ms
latency stddev: 0.702 ms
tps = 17372.386945 (including connections establishing)
tps = 17430.542432 (excluding connections establishing)
statement latencies in milliseconds:
        0.003072        \setrandom id 1 1000000
        0.452489        update tbl set id=1+:id where id=:id;

postgres=# drop index idx_tbl1 ;
DROP INDEX
postgres=# create index idx_tbl1 on tbl using hash (id);
CREATE INDEX

postgres@digoal-> pgbench -M prepared -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 30
progress: 1.0 s, 16321.8 tps, lat 0.411 ms stddev 0.521
progress: 2.0 s, 17372.0 tps, lat 0.458 ms stddev 0.409
progress: 3.0 s, 16731.5 tps, lat 0.475 ms stddev 1.094
progress: 4.0 s, 16972.9 tps, lat 0.469 ms stddev 0.880
progress: 5.0 s, 17392.7 tps, lat 0.457 ms stddev 0.607
......
transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 8
number of threads: 8
duration: 30 s
number of transactions actually processed: 527778
latency average: 0.450 ms
latency stddev: 0.678 ms
tps = 17587.300360 (including connections establishing)
tps = 17671.181609 (excluding connections establishing)
statement latencies in milliseconds:
        0.002955        \setrandom id 1 1000000
        0.446435        update tbl set id=1+:id where id=:id;

SIZE
postgres=# create index idx_tbl1 on tbl using hash (id);
CREATE INDEX
postgres=# create index idx_tbl2 on tbl using btree (id);
CREATE INDEX
postgres=# \di+ idx_tbl*
                         List of relations
 Schema |   Name   | Type  |  Owner   | Table | Size  | Description 
--------+----------+-------+----------+-------+-------+-------------
 public | idx_tbl1 | index | postgres | tbl   | 32 MB | 
 public | idx_tbl2 | index | postgres | tbl   | 21 MB | 
(2 rows)

小结:
查询和更新性能相差无几。
大小,btree是hash的2/3。
相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
目录
相关文章
|
3月前
|
SQL 关系型数据库 数据库
PostgreSQL数据库报错 ERROR: multiple default values specified for column "" of table "" 如何解决?
PostgreSQL数据库报错 ERROR: multiple default values specified for column "" of table "" 如何解决?
364 59
|
SQL 弹性计算 算法
PostgreSQL 普通表在线转换为分区表 - online exchange to partition table
标签 PostgreSQL , 分区表 , 在线转换 背景 非分区表,如何在线(不影响业务)转换为分区表? 方法1,pg_pathman分区插件 《PostgreSQL 9.5+ 高效分区表实现 - pg_pathman》 使用非堵塞式的迁移接口 partition_table_concurrently( relation REGCLASS,
2765 0
|
SQL 存储 Oracle
PostgreSQL 分页, offset, 返回顺序, 扫描方法原理(seqscan, index scan, index only scan, bitmap scan, parallel xx scan),游标
PostgreSQL 分页, offset, 返回顺序, 扫描方法原理(seqscan, index scan, index only scan, bitmap scan, parallel xx scan),游标
3857 0
|
存储 SQL 网络协议
PolarDB for PostgreSQL 采用iprange和SPGiST index 实现超光速 - 全球化业务根据来源IP智能DNS路由就近分配本地机房访问, 提升用户体验
[《如何获得IP地址对应的地理信息库, 实现智能DNS解析? 就近路由提升全球化部署业务的访问性能》](../202211/20221124_09.md) 上一篇信息提到了如何获取IP地址段的地理信息库, 本篇信息将使用PolarDB for PostgreSQL来加速根据来源IP快速找到对应的IP地址段, 将用到PolarDB for PostgreSQL的SPGiST索引和inet数据类型. 相比于把IP地址段存储为2个int8字段作between and的匹配, SPGiST索引和inet数据类型至少可以提升20倍性能.
247 0
|
SQL 存储 弹性计算
PostgreSQL 分页, offset, 返回顺序, 扫描方法原理(seqscan, index scan, index only scan, bitmap scan, parallel xx scan),游标
标签 PostgreSQL , 数据离散性 , 扫描性能 , 重复扫 , bitmap index scan , 排序扫描 , 扫描方法 , 顺序 背景 一个这样的问题: 为什么select x from tbl offset x limit x; 两次查询连续的OFFSET,会有重复数据呢? select ctid,* from tbl where ... offset 0 li
2126 0
|
存储 传感器 关系型数据库
【重新发现PostgreSQL之美】- 8 轨迹业务IO杀手克星index include(覆盖索引)
大家好 ,这里是重新发现PostgreSQL之美 - 8 轨迹业务IO杀手克星index include(覆盖索引)
|
传感器 SQL 并行计算
【重新发现PostgreSQL之美】 - 6 index链表跳跳糖 (CTE recursive 递归的详细用例)
大家好,这里是重新发现PostgreSQL之美 - 6 index链表跳跳糖 (CTE recursive 递归的详细用例)
|
SQL 算法 关系型数据库
PostgreSQL 普通表在线转换为分区表 - online exchange to partition table
PostgreSQL 普通表在线转换为分区表 - online exchange to partition table
2807 0
|
SQL 分布式计算 并行计算
PostgreSQL 并行计算解说 之13 - parallel OLAP : 中间结果 parallel with unlogged table
标签 PostgreSQL , cpu 并行 , smp 并行 , 并行计算 , gpu 并行 , 并行过程支持 背景 PostgreSQL 11 优化器已经支持了非常多场合的并行。简单估计,已支持27余种场景的并行计算。 parallel seq scan parallel
648 0
|
存储 关系型数据库 Go
PostgreSQL 11 内核优化 - 降低vacuum cleanup阶段index scan概率 ( vacuum_cleanup_index_scale_factor , skip index vacuum cleanup stage)
PostgreSQL 11 内核优化 - 降低vacuum cleanup阶段index scan概率 ( vacuum_cleanup_index_scale_factor , skip index vacuum cleanup stage)
1272 0