【重新发现PostgreSQL之美】- 22 黄帝内经-阿里云开发者社区

开发者社区> 传说中的德哥> 正文

【重新发现PostgreSQL之美】- 22 黄帝内经

简介: 大家好,这里是重新发现PostgreSQL之美 - 22 黄帝内经
+关注继续查看

背景


场景:
电商、社交、SAAS软件行业.
用户表、租户表、用户行为表.

挑战:
按企业、appid等hash分区可能出现倾斜, 例如某些APPID的数据量可能非常大, 热数据可能在同一个分区, saas的某些大企业可能被分到同一分区, 使得数据倾斜.

PG 解决方案:
非对称分区方法.

例子


1、创建1个分区表, 按appid hash分区

appid int,
c1 int,
c2 int,
info text,
ts timestamp
) partition by hash (appid);
create unlogged table t0 partition of t for values with (modulus 4, REMAINDER 0);
create unlogged table t1 partition of t for values with (modulus 4, REMAINDER 1);
create unlogged table t2 partition of t for values with (modulus 4, REMAINDER 2);
create unlogged table t3 partition of t for values with (modulus 4, REMAINDER 3);

2、写入2000万记录, 其中1000万appid=1, 制造倾斜

insert into t select 1, random()*100, random()*1000, md5(random()::text), clock_timestamp() from generate_series(1,10000000);
List of relations
Schema |   Name   |       Type        |  Owner   | Persistence | Access method |  Size   | Description
--------+----------+-------------------+----------+-------------+---------------+---------+-------------
public | t        | partitioned table | postgres | unlogged    |               | 0 bytes |
public | t0       | table             | postgres | unlogged    | heap          | 1014 MB |
public | t1       | table             | postgres | unlogged    | heap          | 189 MB  |
public | t2       | table             | postgres | unlogged    | heap          | 222 MB  |
public | t3       | table             | postgres | unlogged    | heap          | 186 MB  |

3、解决办法, 对于appid=1的分区, 再次按c1 hash分区, 其他分区保持不变.

create unlogged table tt0(like t) PARTITION BY hash (c1);
create unlogged table tt0_0 partition of tt0 for values with (modulus 4, REMAINDER 0);
create unlogged table tt0_1 partition of tt0 for values with (modulus 4, REMAINDER 1);
create unlogged table tt0_2 partition of tt0 for values with (modulus 4, REMAINDER 2);
create unlogged table tt0_3 partition of tt0 for values with (modulus 4, REMAINDER 3);
alter table t DETACH PARTITION t0;
insert into tt0 select * from t0;
alter table t attach partition tt0 FOR VALUES WITH ( MODULUS 4, REMAINDER 0 );
end;
Unlogged partitioned table "public.t"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition key: HASH (appid)
Partitions: t1 FOR VALUES WITH (modulus 4, remainder 1),
t2 FOR VALUES WITH (modulus 4, remainder 2),
t3 FOR VALUES WITH (modulus 4, remainder 3),
tt0 FOR VALUES WITH (modulus 4, remainder 0), PARTITIONED
postgres=# \d+ tt0
Unlogged partitioned table "public.tt0"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition of: t FOR VALUES WITH (modulus 4, remainder 0)
Partition constraint: satisfies_hash_partition('16552'::oid, 4, 0, appid)
Partition key: HASH (c1)
Partitions: tt0_0 FOR VALUES WITH (modulus 4, remainder 0),
tt0_1 FOR VALUES WITH (modulus 4, remainder 1),
tt0_2 FOR VALUES WITH (modulus 4, remainder 2),
tt0_3 FOR VALUES WITH (modulus 4, remainder 3)

可以看到appid=1的数据已经打散

List of relations
Schema |   Name   |       Type        |  Owner   | Persistence | Access method |  Size   | Description
--------+----------+-------------------+----------+-------------+---------------+---------+-------------
public | t        | partitioned table | postgres | unlogged    |               | 0 bytes |
public | t1       | table             | postgres | unlogged    | heap          | 189 MB  |
public | t2       | table             | postgres | unlogged    | heap          | 222 MB  |
public | t3       | table             | postgres | unlogged    | heap          | 186 MB  |
public | tt0      | partitioned table | postgres | unlogged    |               | 0 bytes |
public | tt0_0    | table             | postgres | unlogged    | heap          | 279 MB  |
public | tt0_1    | table             | postgres | unlogged    | heap          | 309 MB  |
public | tt0_2    | table             | postgres | unlogged    | heap          | 254 MB  |
public | tt0_3    | table             | postgres | unlogged    | heap          | 172 MB  |

4、检查执行计划, 分区被正常路由

QUERY PLAN
------------------------------------------------------------
Seq Scan on t2 t  (cost=0.00..62947.58 rows=9965 width=53)
Filter: (appid = 2)
(2 rows)
postgres=# explain select * from t where appid=1;
QUERY PLAN
--------------------------------------------------------------------------
Append  (cost=0.00..337327.48 rows=10031536 width=53)
->  Seq Scan on tt0_0 t_1  (cost=0.00..78943.27 rows=2771208 width=53)
Filter: (appid = 1)
->  Seq Scan on tt0_1 t_2  (cost=0.00..87582.51 rows=3057186 width=53)
Filter: (appid = 1)
->  Seq Scan on tt0_2 t_3  (cost=0.00..71825.61 rows=2499200 width=53)
Filter: (appid = 1)
->  Seq Scan on tt0_3 t_4  (cost=0.00..48818.40 rows=1703942 width=53)
Filter: (appid = 1)
(9 rows)
postgres=# explain select * from t where appid=1 and c1=1;
QUERY PLAN
----------------------------------------------------------------
Seq Scan on tt0_0 t  (cost=0.00..87595.73 rows=99763 width=53)
Filter: ((appid = 1) AND (c1 = 1))
(2 rows)

层次不限、分区方法不限

1、如果t1也有问题, 我们可以继续对t1分区进行二次分区, 选择不同的分区个数

create unlogged table tt1(like t) PARTITION BY hash (c1);
create unlogged table tt1_0 partition of tt1 for values with (modulus 5, REMAINDER 0);
create unlogged table tt1_1 partition of tt1 for values with (modulus 5, REMAINDER 1);
create unlogged table tt1_2 partition of tt1 for values with (modulus 5, REMAINDER 2);
create unlogged table tt1_3 partition of tt1 for values with (modulus 5, REMAINDER 3);
create unlogged table tt1_4 partition of tt1 for values with (modulus 5, REMAINDER 4);
alter table t DETACH PARTITION t1;
insert into tt1 select * from t1;
alter table t attach partition tt1 FOR VALUES WITH ( MODULUS 4, REMAINDER 1 );
end;
Unlogged partitioned table "public.t"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition key: HASH (appid)
Partitions: t2 FOR VALUES WITH (modulus 4, remainder 2),
t3 FOR VALUES WITH (modulus 4, remainder 3),
tt0 FOR VALUES WITH (modulus 4, remainder 0), PARTITIONED,
tt1 FOR VALUES WITH (modulus 4, remainder 1), PARTITIONED
postgres=# \d+ tt1
Unlogged partitioned table "public.tt1"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition of: t FOR VALUES WITH (modulus 4, remainder 1)
Partition constraint: satisfies_hash_partition('16552'::oid, 4, 1, appid)
Partition key: HASH (c1)
Partitions: tt1_0 FOR VALUES WITH (modulus 5, remainder 0),
tt1_1 FOR VALUES WITH (modulus 5, remainder 1),
tt1_2 FOR VALUES WITH (modulus 5, remainder 2),
tt1_3 FOR VALUES WITH (modulus 5, remainder 3),
tt1_4 FOR VALUES WITH (modulus 5, remainder 4)

路由正常

QUERY PLAN
-----------------------------------------------------------------------
Append  (cost=0.00..53418.63 rows=9951 width=53)
->  Seq Scan on tt1_0 t_1  (cost=0.00..12278.30 rows=2289 width=53)
Filter: (appid = 3)
->  Seq Scan on tt1_1 t_2  (cost=0.00..8540.11 rows=1593 width=53)
Filter: (appid = 3)
->  Seq Scan on tt1_2 t_3  (cost=0.00..10130.17 rows=1886 width=53)
Filter: (appid = 3)
->  Seq Scan on tt1_3 t_4  (cost=0.00..9849.98 rows=1837 width=53)
Filter: (appid = 3)
->  Seq Scan on tt1_4 t_5  (cost=0.00..12570.31 rows=2346 width=53)
Filter: (appid = 3)
(11 rows)
postgres=# explain select * from t where appid=3 and c1=2;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on tt1_1 t  (cost=0.00..9476.14 rows=103 width=53)
Filter: ((appid = 3) AND (c1 = 2))
(2 rows)

2、我们也可以对t2使用range或list分区

create unlogged table tt2(like t) PARTITION BY range (ts);
create unlogged table tt2_2021 partition of tt2 for values from ('2021-01-01') to ('2022-01-01');
create unlogged table tt2_2022 partition of tt2 for values from ('2022-01-01') to ('2023-01-01');
alter table t DETACH PARTITION t2;
insert into tt2 select * from t2;
alter table t attach partition tt2 FOR VALUES WITH ( MODULUS 4, REMAINDER 2 );
end;
Unlogged partitioned table "public.t"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition key: HASH (appid)
Partitions: t3 FOR VALUES WITH (modulus 4, remainder 3),
tt0 FOR VALUES WITH (modulus 4, remainder 0), PARTITIONED,
tt1 FOR VALUES WITH (modulus 4, remainder 1), PARTITIONED,
tt2 FOR VALUES WITH (modulus 4, remainder 2), PARTITIONED
postgres=# \d+ tt2
Unlogged partitioned table "public.tt2"
Column |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
appid  | integer                     |           |          |         | plain    |             |              |
c1     | integer                     |           |          |         | plain    |             |              |
c2     | integer                     |           |          |         | plain    |             |              |
info   | text                        |           |          |         | extended |             |              |
ts     | timestamp without time zone |           |          |         | plain    |             |              |
Partition of: t FOR VALUES WITH (modulus 4, remainder 2)
Partition constraint: satisfies_hash_partition('16552'::oid, 4, 2, appid)
Partition key: RANGE (ts)
Partitions: tt2_2021 FOR VALUES FROM ('2021-01-01 00:00:00') TO ('2022-01-01 00:00:00'),
tt2_2022 FOR VALUES FROM ('2022-01-01 00:00:00') TO ('2023-01-01 00:00:00')

使用PG, 支持非对称分区(深度不对称、分区方法不对称、分区数不对称), 彻底解决数据倾斜问题.

除了分, 还能合, 对于某些比较小的分区可以合起来减少分区数. 和黄帝内经的人与自然,五行平衡学说类似, 实则泻之,虚则补之.

参考


《如何修改PostgreSQL分区表分区范围- detach attach - 拆分、合并、非平衡分区表、深度不一致分区表》

《PostgreSQL hash分区表扩容、缩容(增加分区、减少分区、分区重分布、拆分区、合并分区), hash算法hash_any, 混合hash MODULUS 分区- attach , detach》

https://www.postgresql.org/docs/14/sql-createtable.html

 

版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。

相关文章
一文快速搞定Redis_数据类型及JavaApi操作
大家好,我是**ChinaManor**,直译过来就是中国码农的意思,我希望自己能成为国家复兴道路的铺路人,大数据领域的耕耘者,平凡但不甘于平庸的人。
9 0
一文快速搞懂系列__一文快速搞懂SuperSet[实战案例]
大家好,我是**ChinaManor**,直译过来就是中国码农的意思,俺希望自己能成为国家复兴道路的铺路人,大数据领域的耕耘者,平凡但不甘于平庸的人。
11 0
偏序关系
偏序 关系 离散数学
6 0
PG+MySQL第9课-实时精准营销
通常业务场景会涉及基于标签条件圈选目标客户、基于用户特征值扩选相似人群、群体用户画像分析这些技术,本文将围绕这三个场景去介绍在实施精准营销里面的PG数据库的使用
8 0
冬季实战营第一期学习报告
通过五天学习与实操,对ECS云服务器入门、快速搭建LAMP环境、部署MySQL数据库、回顾搭建Docker环境和Spring Boot以及使用PolarDB和ECS搭建门户网站操作,对本期学习与实操的认识。
9 0
研发职位到底应该怎么设置?(下)
研发职位到底应该怎么设置?(下)
5 0
冬季实战营第一期:从零到一上手玩转云服务器实验报告
第一期主要进行了六次实验,分别是《动手实操ECS云服务器》、《动手实操快速搭建LAMP环境》、《使用ECS服务器部署MySQL数据库》、《通过workbench远程登录ECS,快速搭建Docker环境》、《从零搭建Spring Boot的Hello World》以及《使用PolarDB和ECS搭建门户网站》。首先远程登陆ECS实例,搭建LAMP环境,其中LAMP分别代表Linux、Apache、MySQL和PHP。然后配置及远程访问MySQL。冯晓帅老师在直播上带大家通过workbench登录ECS并快速搭建Docker环境,运行Spring Boot,最后安装WordPress并搭建博客。
9 0
我的ECS使用体验
在阿里云ECS服务器使用中学习和收获了很多。
11 0
MySQL高可用架构演进
MySQL是数据库领域当之无愧的霸主之一,其在各行各业被广泛应用,随着广泛使用,对于MySQL本身的高可用性的要求就是不可避免的话题,而MySQL的高可用方案也随着MySQL功能的完善经历了多次升级,本文将对MySQL的各种高可用架构进行分析,以此来了解架构的演进。
10 0
172
文章
0
问答
来源圈子
更多
德哥:阿里巴巴高级产品专家, 阿里巴巴钻石布道师, 42项数据库专利, 目前任职于数据库架构组, 负责数据库开源,同时也是PostgreSQL 中国社区发起人之一, 负责PostgreSQL数据库在中国的技术落地与推广、人才培养,开设有开源社、德说、重新发现PG之美等专栏。
+ 订阅
文章排行榜
最热
最新
相关电子书
更多
《2021云上架构与运维峰会演讲合集》
立即下载
《零基础CSS入门教程》
立即下载
《零基础HTML入门教程》
立即下载