PostgreSQL 9.6 sharding based on FDW & pg_pathman-阿里云开发者社区

开发者社区> 德哥> 正文

PostgreSQL 9.6 sharding based on FDW & pg_pathman

简介:
+关注继续查看

PostgreSQL 9.6 sharding based on FDW & pg_pathman

作者

digoal

日期

2016-10-26

标签

PostgreSQL , 分区表 , pg_pathman , custom scan api , sharding , FDW


背景

可以阅读以下几篇文章先回顾一下FDW,基于FDW的shared以及高效的分区插件pg_pathman。

《PostgreSQL 9.6 单元化,sharding (based on postgres_fdw) - 内核层支持前传》

《PostgreSQL 9.6 sharding + 单元化 (based on postgres_fdw) 最佳实践 - 通用水平分库场景设计与实践》

《PostgreSQL 9.5+ 高效分区表实现 - pg_pathman》

由于pg_pathman支持foreign table,所以拿它来做sharding也是理所当然的事情,同时它支持的HASH partition不需要在QUERY中带入constraint对应的clause,也能走分区查询。

例子

依旧使用《PostgreSQL 9.6 sharding + 单元化 (based on postgres_fdw) 最佳实践 - 通用水平分库场景设计与实践》文章中的例子。

例子使用1个路由节点,4个数据节点。

1. 配置目标库的pg_hba.conf

使用md5认证

2. 创建用户

create role role0 nosuperuser login encrypted password 'pwd' ;
create role role1 nosuperuser login encrypted password 'pwd' ;
create role role2 nosuperuser login encrypted password 'pwd' ;
create role role3 nosuperuser login encrypted password 'pwd' ;
create role digoal nosuperuser login encrypted password 'pwd' ;

3. 创建数据库

create database db0 with template template0 ;
create database db1 with template template0 ;
create database db2 with template template0 ;
create database db3 with template template0 ;
create database mas1 with template template0 ;

4. 配置数据库权限

grant all on database db0 to role0 ;
grant all on database db1 to role1 ;
grant all on database db2 to role2 ;
grant all on database db3 to role3 ;
grant all on database mas1 to digoal ;

5. 创建schema, 建议与USER同名

\c db0 role0
create schema role0;
\c db1 role1
create schema role1;
\c db2 role2
create schema role2;
\c db3 role3
create schema role3;
\c mas1 digoal
create schema digoal;

6. 配置路由节点,需要用到postgres_fdw与pg_pathman插件

\c mas1 postgres
create extension postgres_fdw;
create extension pg_pathman;
grant usage on foreign data wrapper postgres_fdw to digoal;

7. ddl建表语句 , ddl-1.sql

create table userinfo(uid int8 primary key, info text, crt_time timestamp);

create table user_log(uid int8 not null, logintime timestamp, logouttime timestamp, client_addr inet, client_port int);

create table user_membership(uid1 int8 not null, uid2 int8, unique (uid1,uid2) );

create table user_membership_rev(uid2 int8 not null, uid1 int8, unique (uid2,uid1) );

8. 初始化数据节点,创建数据表,注意表名最好全局唯一,方便使用import foreign schema的方式导入。

\c db0 role0

ddl-1.sql

然后执行 :   
do language plpgsql $$
declare
  i int;
  mod int := 1;
  nod int := 0;
  sql text;
begin
  for i in (mod*nod)..(mod*(nod+1)-1) loop
      sql := 'create table userinfo_'||i||'(like userinfo including all)';
      execute sql;
      sql := 'create table user_log_'||i||'(like user_log including all)';
      execute sql;
      sql := 'create table user_membership_'||i||'(like user_membership including all)';
      execute sql;
      sql := 'create table user_membership_rev_'||i||'(like user_membership_rev including all)';
      execute sql;
  end loop;
end;
$$;
\c db1 role1

ddl-1.sql

然后执行 :   
do language plpgsql $$
declare
  i int;
  mod int := 1;
  nod int := 1;
  sql text;
begin
  for i in (mod*nod)..(mod*(nod+1)-1) loop
      sql := 'create table userinfo_'||i||'(like userinfo including all)';
      execute sql;
      sql := 'create table user_log_'||i||'(like user_log including all)';
      execute sql;
      sql := 'create table user_membership_'||i||'(like user_membership including all)';
      execute sql;
      sql := 'create table user_membership_rev_'||i||'(like user_membership_rev including all)';
      execute sql;
  end loop;
end;
$$;
\c db2 role2

ddl-1.sql

然后执行 :   
do language plpgsql $$
declare
  i int;
  mod int := 1;
  nod int := 2;
  sql text;
begin
  for i in (mod*nod)..(mod*(nod+1)-1) loop
      sql := 'create table userinfo_'||i||'(like userinfo including all)';
      execute sql;
      sql := 'create table user_log_'||i||'(like user_log including all)';
      execute sql;
      sql := 'create table user_membership_'||i||'(like user_membership including all)';
      execute sql;
      sql := 'create table user_membership_rev_'||i||'(like user_membership_rev including all)';
      execute sql;
  end loop;
end;
$$;
\c db3 role3

ddl-1.sql

然后执行 :   
do language plpgsql $$
declare
  i int;
  mod int := 1;
  nod int := 3;
  sql text;
begin
  for i in (mod*nod)..(mod*(nod+1)-1) loop
      sql := 'create table userinfo_'||i||'(like userinfo including all)';
      execute sql;
      sql := 'create table user_log_'||i||'(like user_log including all)';
      execute sql;
      sql := 'create table user_membership_'||i||'(like user_membership including all)';
      execute sql;
      sql := 'create table user_membership_rev_'||i||'(like user_membership_rev including all)';
      execute sql;
  end loop;
end;
$$;

9. 初始化master

\c mas1 digoal


-- 初始化foreign server
-- or user host (本例使用unix socket)
create server db0 foreign data wrapper postgres_fdw OPTIONS (host '/u02/digoal/pg_root5281', port '5281', dbname 'db0', use_remote_estimate 'false', fetch_size '5000000');
create server db1 foreign data wrapper postgres_fdw OPTIONS (host '/u02/digoal/pg_root5281', port '5281', dbname 'db1', use_remote_estimate 'false', fetch_size '5000000');
create server db2 foreign data wrapper postgres_fdw OPTIONS (host '/u02/digoal/pg_root5281', port '5281', dbname 'db2', use_remote_estimate 'false', fetch_size '5000000');
create server db3 foreign data wrapper postgres_fdw OPTIONS (host '/u02/digoal/pg_root5281', port '5281', dbname 'db3', use_remote_estimate 'false', fetch_size '5000000');
-- 设置use_remote_estimate 'false' 需收集外部表的统计信息, 但是可以节约explain的操作,对于高并发的小事务,建议使用FALSE


-- 创建user mapping
create user mapping FOR digoal SERVER db0 options (user 'role0', password 'pwd');
create user mapping FOR digoal SERVER db1 options (user 'role1', password 'pwd');
create user mapping FOR digoal SERVER db2 options (user 'role2', password 'pwd');
create user mapping FOR digoal SERVER db3 options (user 'role3', password 'pwd');


-- 导入FOREIGN TABLE,排除主表
import foreign schema role0 EXCEPT (userinfo,user_log,user_membership,mv_user_membership,user_membership_rev,mv_user_membership_rev) from server db0 into digoal;
import foreign schema role1 EXCEPT (userinfo,user_log,user_membership,mv_user_membership,user_membership_rev,mv_user_membership_rev) from server db1 into digoal;
import foreign schema role2 EXCEPT (userinfo,user_log,user_membership,mv_user_membership,user_membership_rev,mv_user_membership_rev) from server db2 into digoal;
import foreign schema role3 EXCEPT (userinfo,user_log,user_membership,mv_user_membership,user_membership_rev,mv_user_membership_rev) from server db3 into digoal;


-- 创建主表
create table userinfo(uid int8 primary key, info text, crt_time timestamp);
create table user_log(uid int8 not null, logintime timestamp, logouttime timestamp, client_addr inet, client_port int);
create table user_membership(uid1 int8 not null, uid2 int8, unique (uid1,uid2) );
create table user_membership_rev(uid2 int8 not null, uid1 int8, unique (uid2,uid1) );


-- 生成哈希分区
select create_hash_partitions('userinfo'::regclass, 'uid', 0, true) ;
select create_hash_partitions('user_log'::regclass, 'uid', 0, true) ;
select create_hash_partitions('user_membership'::regclass, 'uid1', 0, true) ;
select create_hash_partitions('user_membership_rev'::regclass, 'uid2', 0, true) ;


-- 设置foreign table 约束,目前pg_pathman不支持attach hash 分区表,所以我把步骤分解了,先构建约束
-- 注意约束名有固定的格式
do language plpgsql $$
declare
  i int;
  x int := 0;
  y int := 3;
  sql text;
begin
  for i in x..y loop
      sql := 'alter foreign table userinfo_'||i||' add constraint pathman_userinfo_'||i||'_1_check check(get_hash_part_idx(hashint8(uid), 4) = '||i||' )';
      execute sql;
      sql := 'alter foreign table user_log_'||i||' add constraint pathman_user_log_'||i||'_1_check check(get_hash_part_idx(hashint8(uid), 4) = '||i||' )';
      execute sql;
      sql := 'alter foreign table user_membership_'||i||' add constraint pathman_user_membership_'||i||'_1_check check(get_hash_part_idx(hashint8(uid1), 4) = '||i||' )';
      execute sql;
      sql := 'alter foreign table user_membership_rev_'||i||' add constraint pathman_user_membership_rev_'||i||'_1_check check(get_hash_part_idx(hashint8(uid2), 4) = '||i||' )';
      execute sql;
  end loop;
end;
$$;


-- 设置继承关系
do language plpgsql $$
declare
  i int;
  x int := 0;
  y int := 3;
  sql text;
begin
  for i in x..y loop
      sql := 'alter foreign table userinfo_'||i||' inherit userinfo';
      execute sql;
      sql := 'alter foreign table user_log_'||i||' inherit user_log';
      execute sql;
      sql := 'alter foreign table user_membership_'||i||' inherit user_membership';
      execute sql;
      sql := 'alter foreign table user_membership_rev_'||i||' inherit user_membership_rev';
      execute sql;
  end loop;
end;
$$;


-- 前面都准备好之后,可以设置pg_pathman的内存,刷新。
do language plpgsql $$
declare
  i int;
  x int := 0;
  y int := 3;
  tbls text[] := array['userinfo', 'user_log', 'user_membership', 'user_membership_rev'];
  tbl text;
  child text;
  sql text;
begin
  foreach tbl in array tbls loop 
    for i in x..y loop
      child := tbl||'_'||i;
      -- 将子表添加到pg_pathman hash memory中
      perform public.invoke_on_partition_created_callback(tbl::regclass, child::REGCLASS, 0);
    end loop;
    -- 刷新
    perform public.on_create_partitions(tbl);
  end loop;
end;
$$;

10. 插入压测

插入数据测试的函数

create or replace function f(int) returns void as $$
declare
begin 
  insert into userinfo values ($1, md5(random()::Text), now());
 exception when others then return;
end;
$$ language plpgsql strict;

压测

$ vi test1.sql
\set id random(1,50000000)
select f(:id);

$ pgbench -M simple -n -r -P 1 -f ./test1.sql -c 64 -j 64 -T 1200 -U digoal mas1
...
progress: 102.0 s, 42769.1 tps, lat 1.495 ms stddev 0.512
progress: 103.0 s, 42633.0 tps, lat 1.500 ms stddev 0.519
progress: 104.0 s, 42864.0 tps, lat 1.492 ms stddev 0.525
progress: 105.0 s, 42724.0 tps, lat 1.497 ms stddev 0.537
progress: 106.0 s, 42801.3 tps, lat 1.494 ms stddev 0.507
progress: 107.0 s, 42547.7 tps, lat 1.503 ms stddev 0.524
...
   PerfTop:   54084 irqs/sec  kernel:40.6%  exact:  0.0% [1000Hz cycles],  (all, 64 CPUs)
---------------------------------------------------------------------------------------------
             samples  pcnt function                      DSO
             _______ _____ _____________________________ ____________________________________

            18991.00  2.5% AllocSetAlloc                 /home/digoal/pgsql9.6/bin/postgres
            17321.00  2.3% _spin_lock                    [kernel.kallsyms]                   
            15946.00  2.1% GetSnapshotData               /home/digoal/pgsql9.6/bin/postgres
            15049.00  2.0% hash_search_with_hash_value   /home/digoal/pgsql9.6/bin/postgres
            12864.00  1.7% base_yyparse                  /home/digoal/pgsql9.6/bin/postgres
            10796.00  1.4% SearchCatCache                /home/digoal/pgsql9.6/bin/postgres
             9826.00  1.3% memcpy                        /lib64/libc-2.12.so                 
             8840.00  1.2% _int_malloc                   /lib64/libc-2.12.so                 
             8261.00  1.1% __strlen_sse42                /lib64/libc-2.12.so                 
             8245.00  1.1% schedule                      [kernel.kallsyms]                   
             7441.00  1.0% __GI_vfprintf                 /lib64/libc-2.12.so                 
             7348.00  1.0% AtEOXact_GUC                  /home/digoal/pgsql9.6/bin/postgres
             6679.00  0.9% __strcmp_sse42                /lib64/libc-2.12.so                 
             6531.00  0.9% mutex_spin_on_owner           [kernel.kallsyms]                   
             5899.00  0.8% reschedule_interrupt          [kernel.kallsyms]                   

直连测试

\c db0 role0

create or replace function f(int) returns void as $$
declare
begin 
  insert into userinfo values ($1, md5(random()::Text), now());
 exception when others then return;
end;
$$ language plpgsql strict;
$ pgbench -M prepared -n -r -P 1 -f ./test1.sql -c 64 -j 64 -T 1200 -U role0 db0

progress: 6.0 s, 149615.0 tps, lat 0.427 ms stddev 0.389
progress: 7.0 s, 149022.1 tps, lat 0.428 ms stddev 0.398
progress: 8.0 s, 148586.6 tps, lat 0.430 ms stddev 0.430

11. query测试

压测

$ vi test1.sql
\set id random(1,50000000)
select * from userinfo where uid=:id;

$ pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 1200 -U digoal mas1
...
progress: 158.0 s, 46046.8 tps, lat 1.388 ms stddev 0.960
progress: 159.0 s, 46201.2 tps, lat 1.384 ms stddev 0.945
progress: 160.0 s, 46176.9 tps, lat 1.385 ms stddev 0.957
progress: 161.0 s, 46143.0 tps, lat 1.386 ms stddev 0.943
progress: 162.0 s, 46316.0 tps, lat 1.381 ms stddev 0.955
progress: 163.0 s, 46067.1 tps, lat 1.388 ms stddev 0.954
...
   PerfTop:   46994 irqs/sec  kernel:40.7%  exact:  0.0% [1000Hz cycles],  (all, 64 CPUs)
-------------------------------------------------------------------------------------------------
             samples  pcnt function                      DSO
             _______ _____ _____________________________ ________________________________________

            17838.00  2.4% GetSnapshotData               /home/digoal/pgsql9.6/bin/postgres    
            17711.00  2.4% _spin_lock                    [kernel.kallsyms]                       
            17372.00  2.4% AllocSetAlloc                 /home/digoal/pgsql9.6/bin/postgres    
            15205.00  2.1% base_yyparse                  /home/digoal/pgsql9.6/bin/postgres    
            12210.00  1.7% SearchCatCache                /home/digoal/pgsql9.6/bin/postgres    
            11821.00  1.6% hash_search_with_hash_value   /home/digoal/pgsql9.6/bin/postgres    
            10729.00  1.5% __mutex_lock_slowpath         [kernel.kallsyms]                       
            10694.00  1.5% mutex_spin_on_owner           [kernel.kallsyms]                       
            10168.00  1.4% _int_malloc                   /lib64/libc-2.12.so                     
             8217.00  1.1% schedule                      [kernel.kallsyms]                       
             6977.00  1.0% __strlen_sse42                /lib64/libc-2.12.so                     
             6918.00  0.9% __strcmp_sse42                /lib64/libc-2.12.so                     
             6711.00  0.9% _spin_lock_irqsave            [kernel.kallsyms]                       
             6629.00  0.9% reschedule_interrupt          [kernel.kallsyms]                       
             6599.00  0.9% find_busiest_group            [kernel.kallsyms]                       
             5984.00  0.8% core_yylex                    /home/digoal/pgsql9.6/bin/postgres                                   

直连测试

$ vi test.sql
\set id random(1,50000000)
select * from userinfo_0 where uid=:id;

$ pgbench -M prepared -n -r -P 1 -f ./test.sql -c 64 -j 64 -T 1200 -U role0 db0
progress: 3.0 s, 1065981.5 tps, lat 0.059 ms stddev 0.008
progress: 4.0 s, 1084914.1 tps, lat 0.058 ms stddev 0.006
progress: 5.0 s, 1093083.4 tps, lat 0.057 ms stddev 0.006

小结

1. 基于fdw的shard,从功能上来讲还有哪些值得改进的点:

例如, 支持聚合下推(10.0已经支持)

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=f5d6bce63ceb3c59a964814bb0df5a0648e750e5

https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=7012b132d07c2b4ea15b0b3cb1ea9f3278801d98

append的并行化,目前如果要扫描多个分区,是串行的,并不是并行扫描,非常影响效率。

2. 从性能方面来讲,还有哪些值得改进的点:

从profile的结果来看,目前的锁较重,有非常大的性能提升空间,需要花点时间看看代码。

当然如果当前的性能能满足你,或者你通过构建多个对等的master加上负载均衡,也能解决这个问题。

现在与直连单节点的性能差距还是较大的。

Count

版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。

相关文章
阿里云服务器怎么设置密码?怎么停机?怎么重启服务器?
如果在创建实例时没有设置密码,或者密码丢失,您可以在控制台上重新设置实例的登录密码。本文仅描述如何在 ECS 管理控制台上修改实例登录密码。
10095 0
使用OpenApi弹性释放和设置云服务器ECS释放
云服务器ECS的一个重要特性就是按需创建资源。您可以在业务高峰期按需弹性的自定义规则进行资源创建,在完成业务计算的时候释放资源。本篇将提供几个Tips帮助您更加容易和自动化的完成云服务器的释放和弹性设置。
12078 0
阿里云服务器如何登录?阿里云服务器的三种登录方法
购买阿里云ECS云服务器后如何登录?场景不同,阿里云优惠总结大概有三种登录方式: 登录到ECS云服务器控制台 在ECS云服务器控制台用户可以更改密码、更换系.
13893 0
windows server 2008阿里云ECS服务器安全设置
最近我们Sinesafe安全公司在为客户使用阿里云ecs服务器做安全的过程中,发现服务器基础安全性都没有做。为了为站长们提供更加有效的安全基础解决方案,我们Sinesafe将对阿里云服务器win2008 系统进行基础安全部署实战过程! 比较重要的几部分 1.
9161 0
腾讯云服务器 设置ngxin + fastdfs +tomcat 开机自启动
在tomcat中新建一个可以启动的 .sh 脚本文件 /usr/local/tomcat7/bin/ export JAVA_HOME=/usr/local/java/jdk7 export PATH=$JAVA_HOME/bin/:$PATH export CLASSPATH=.
4664 0
阿里云服务器ECS登录用户名是什么?系统不同默认账号也不同
阿里云服务器Windows系统默认用户名administrator,Linux镜像服务器用户名root
4510 0
+关注
德哥
公益是一辈子的事, I'm digoal, just do it.
2153
文章
245
问答
来源圈子
更多
阿里云数据库:帮用户承担一切数据库风险,给您何止是安心!支持关系型数据库:MySQL、SQL Server、PostgreSQL、PPAS(完美兼容Oracle)、自研PB级数据存储的分布式数据库Petadata、自研金融级云数据库OceanBase支持NoSQL数据库:MongoDB、Redis、Memcache更有褚霸、丁奇、德哥、彭立勋、玄惭、叶翔等顶尖数据库专家服务。
+ 订阅
文章排行榜
最热
最新
相关电子书
更多
《2021云上架构与运维峰会演讲合集》
立即下载
《零基础CSS入门教程》
立即下载
《零基础HTML入门教程》
立即下载