PostgreSQL 分库分表 插件之一 pg_shard

本文涉及的产品
云原生数据库 PolarDB MySQL 版,Serverless 5000PCU 100GB
云原生数据库 PolarDB 分布式版,标准版 2核8GB
云数据库 RDS MySQL Serverless,0.5-2RCU 50GB
简介:

MySQL的分库分表有非常多的解决方案,PostgreSQL 的分库分表方案也不少。
今天要给大家介绍的是pg_shard插件。
安装很简单,
如果你的GCC版本第一4.6,那么首先要安装一个高版本的GCC,因为pg_shard里面用了gcc 4.6以后新加的特性。

# yum install -y gmp mpfr libmpc libmpc-devel  
  
# wget http://gcc.cybermirror.org/releases/gcc-4.9.3/gcc-4.9.3.tar.bz2  
# tar -jxvf gcc-4.9.3.tar.bz2   
# cd gcc-4.9.3  
# ./configure --prefix=/opt/gcc4.9.3  
# make && make install  
# vi /etc/ld.so.conf  
/opt/gcc4.9.3/lib  
/opt/gcc4.9.3/lib64  
# ldconfig  
# ldconfig -p|grep gcc  
# vi /etc/profile  
export PATH=/opt/gcc4.9.3/bin:$PATH  

安装pg_shard

# git clone https://github.com/citusdata/pg_shard.git  
# cd pg_shard/  

切换到master分支,使用1.2.2版本。

# git checkout master  
commit 7e6103f79e3651eac0b32429f5fb103eb2a8ebdd  
Merge: 2b221d9 ac35076  
Author: Jason Petersen <jason@citusdata.com>  
Date:   Fri Aug 28 19:12:16 2015 -0600  
  
    Merge branch 'release-1.2.2'  
......  

安装:

# . /home/postgres/.bash_profile  
# make clean; make; make install  

假设我的环境中有5个数据库实例,其中一个master,4个worker。

su - postgres  
cd $PGDATA  

在master实例的$PGDATA中编辑一个pg_worker_list.conf 文件。

postgres@digoal-> vi pg_worker_list.conf   
localhost 1922  
localhost 1923  
localhost 1924  
localhost 1925  

同时确保master所在主机,连接work节点数据库不需要密码,或密码已经存放在.pgpass密码文件。

postgres@digoal-> cat $PGDATA1/pg_hba.conf |grep ^local  
local   all             all                                     trust  
postgres@digoal-> cat $PGDATA2/pg_hba.conf |grep ^local  
local   all             all                                     trust  
postgres@digoal-> cat $PGDATA3/pg_hba.conf |grep ^local  
local   all             all                                     trust  
postgres@digoal-> cat $PGDATA4/pg_hba.conf |grep ^local  
local   all             all                                     trust  

在master节点配置pg_shard.

vi $PGDATA/postgresql.conf  
shared_preload_libraries = 'pg_shard'  
pg_ctl restart -m fast  
psql  

为了确保主和worker一致,正常的流程是,在master和所有的worker节点创建一致的:

role  
database  
schema  

在主节点,连接到database,创建pg_shard 扩展模块。

postgres=# create extension pg_shard;  

在主节点创建测试表

postgres=# CREATE TABLE customer_reviews                                               
(  
    customer_id TEXT NOT NULL,  
    review_date DATE,  
    review_rating INTEGER,  
    review_votes INTEGER,  
    review_helpful_votes INTEGER,  
    product_id CHAR(10),  
    product_title TEXT,  
    product_sales_rank BIGINT,  
    product_group TEXT,  
    product_category TEXT,  
    product_subcategory TEXT,  
    similar_product_ids CHAR(10)[]  
);  

创建合适的约束,索引,建议在定义work table前都确定下来,否则以后添加索引要在所有节点手工添加。
在主节点,调用以下函数,构造work table。表名,字段名为分布列。

postgres=# SELECT master_create_distributed_table('customer_reviews', 'customer_id');  

在主节点,调用以下函数,在子节点创建work table,16个分片,每个分片保存2份。

postgres=# SELECT master_create_worker_shards('customer_reviews', 16, 2);  

在主节点,可以看到元数据。

postgres=# \dt pgs_distribution_metadata.*  
                       List of relations  
          Schema           |      Name       | Type  |  Owner     
---------------------------+-----------------+-------+----------  
 pgs_distribution_metadata | partition       | table | postgres  
 pgs_distribution_metadata | shard           | table | postgres  
 pgs_distribution_metadata | shard_placement | table | postgres  
(3 rows)  
  
postgres=# select * from pgs_distribution_metadata.partition ;  
 relation_id | partition_method |     key       
-------------+------------------+-------------  
       42067 | h                | customer_id  
(1 row)  
postgres=# select 42067::regclass;  
     regclass       
------------------  
 customer_reviews  
(1 row)  
  
postgres=# select * from pgs_distribution_metadata.shard;  
  id   | relation_id | storage |  min_value  |  max_value    
-------+-------------+---------+-------------+-------------  
 10000 |       42067 | t       | -2147483648 | -1879048194  
 10001 |       42067 | t       | -1879048193 | -1610612739  
 10002 |       42067 | t       | -1610612738 | -1342177284  
 10003 |       42067 | t       | -1342177283 | -1073741829  
 10004 |       42067 | t       | -1073741828 | -805306374  
 10005 |       42067 | t       | -805306373  | -536870919  
 10006 |       42067 | t       | -536870918  | -268435464  
 10007 |       42067 | t       | -268435463  | -9  
 10008 |       42067 | t       | -8          | 268435446  
 10009 |       42067 | t       | 268435447   | 536870901  
 10010 |       42067 | t       | 536870902   | 805306356  
 10011 |       42067 | t       | 805306357   | 1073741811  
 10012 |       42067 | t       | 1073741812  | 1342177266  
 10013 |       42067 | t       | 1342177267  | 1610612721  
 10014 |       42067 | t       | 1610612722  | 1879048176  
 10015 |       42067 | t       | 1879048177  | 2147483647  
(16 rows)  
  
postgres=# select * from pgs_distribution_metadata.shard_placement;  
 id | shard_id | shard_state | node_name | node_port   
----+----------+-------------+-----------+-----------  
  1 |    10000 |           1 | localhost |      1922  
  2 |    10000 |           1 | localhost |      1923  
  3 |    10001 |           1 | localhost |      1923  
  4 |    10001 |           1 | localhost |      1924  
  5 |    10002 |           1 | localhost |      1924  
  6 |    10002 |           1 | localhost |      1925  
  7 |    10003 |           1 | localhost |      1925  
  8 |    10003 |           1 | localhost |      1922  
  9 |    10004 |           1 | localhost |      1922  
 10 |    10004 |           1 | localhost |      1923  
 11 |    10005 |           1 | localhost |      1923  
 12 |    10005 |           1 | localhost |      1924  
 13 |    10006 |           1 | localhost |      1924  
 14 |    10006 |           1 | localhost |      1925  
 15 |    10007 |           1 | localhost |      1925  
 16 |    10007 |           1 | localhost |      1922  
 17 |    10008 |           1 | localhost |      1922  
 18 |    10008 |           1 | localhost |      1923  
 19 |    10009 |           1 | localhost |      1923  
 20 |    10009 |           1 | localhost |      1924  
 21 |    10010 |           1 | localhost |      1924  
 22 |    10010 |           1 | localhost |      1925  
 23 |    10011 |           1 | localhost |      1925  
 24 |    10011 |           1 | localhost |      1922  
 25 |    10012 |           1 | localhost |      1922  
 26 |    10012 |           1 | localhost |      1923  
 27 |    10013 |           1 | localhost |      1923  
 28 |    10013 |           1 | localhost |      1924  
 29 |    10014 |           1 | localhost |      1924  
 30 |    10014 |           1 | localhost |      1925  
 31 |    10015 |           1 | localhost |      1925  
 32 |    10015 |           1 | localhost |      1922  
(32 rows)  

可以看到,每个分片都有2个副本。就是我们前面创建work table时指定的副本数量。

目前版本pg_shard的使用限制,将来会改进:

  1. 不能使用子查询。
postgres=# insert into customer_reviews select generate_series(1,100);  
ERROR:  0A000: cannot perform distributed planning for the given query  
DETAIL:  Subqueries are not supported in distributed queries.  
LOCATION:  ErrorIfQueryNotSupported, pg_shard.c:567  
  1. 不能使用变量
postgres=# insert into customer_reviews values ('a',now());  
ERROR:  0A000: cannot plan sharded modification containing values which are not constants or constant expressions  
LOCATION:  ErrorIfQueryNotSupported, pg_shard.c:638  
  1. 不能使用绑定变量
postgres=# prepare a (text) as insert into customer_reviews values ($1);  
ERROR:  0A000: PREPARE commands on distributed tables are unsupported  
LOCATION:  PgShardProcessUtility, pg_shard.c:2098  
postgres@digoal-> vi test.sql  
\setrandom id 1 1000000  
insert into customer_reviews values (:id);  
postgres@digoal-> pgbench -M extended -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 1000  
Client 2 aborted in state 1: ERROR:  unrecognized node type: 2100  
Client 4 aborted in state 1: ERROR:  unrecognized node type: 2100  
Client 1 aborted in state 1: ERROR:  unrecognized node type: 2100  
Client 7 aborted in state 1: ERROR:  unrecognized node type: 2100  
Client 3 aborted in state 1: ERROR:  unrecognized node type: 2100  
Client 0 aborted in state 1: ERROR:  unrecognized node type: 2100  
Client 5 aborted in state 1: ERROR:  unrecognized node type: 2100  
Client 6 aborted in state 1: ERROR:  unrecognized node type: 2100  
transaction type: Custom query  
scaling factor: 1  
query mode: extended  
number of clients: 8  
number of threads: 8  
duration: 1000 s  
number of transactions actually processed: 0  
postgres@digoal-> pgbench -M simple -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 1000  
progress: 2.8 s, 0.7 tps, lat 2568.361 ms stddev 16.254  
progress: 3.2 s, 68.3 tps, lat 578.633 ms stddev 1043.300  
progress: 3.2 s, 264.5 tps, lat 8.263 ms stddev 3.007  
progress: 4.0 s, 1193.6 tps, lat 8.561 ms stddev 36.589  
progress: 5.0 s, 1255.6 tps, lat 6.376 ms stddev 5.437  
progress: 6.0 s, 1277.5 tps, lat 6.263 ms stddev 2.644  

中期TODO,不支持分布式JOIN,不支持分布式事务,不支持非分布列的唯一约束,FK约束。
短期TODO,不支持表结构修改,不支持删除表。

另一个问题是,pg_shard需要一个master数据库,而且master没有办法做对等设备。所以master容易成为瓶颈,特别是网络瓶颈和CPU瓶颈。因此要突破这几个瓶颈和问题的话,在OLTP中使用就更加靠谱了。

鉴于以上这些限制,你要考虑清楚pg_shard是否能满足你的需求。
个人认为最靠谱的还是连接池代理,轻量,而且容易做对等设备,可以很好的解决性能和效率的问题。
(但是同样很难实现分布式事务和分布式JOIN,以及分布式唯一和FK约束,但是你得考虑清楚,你是否真的需要这些?)

另外,你可以使用9.4版本的jdbc,已经支持负载均衡和failover了。
http://blog.163.com/digoal@126/blog/static/16387704020158241250463/
如果你没有跨库事务和分布式JOIN,以及分布式唯一和FK约束的需求。目前jdbc 9.4 + plproxy可以完美的实现真正性能线性增长的数据库分片。

分片节点损坏了如何修复?
关闭其中一个节点:

postgres@digoal-> pg_ctl stop -m fast -D /data01/pg_root_1922  
waiting for server to shut down.... done  
server stopped  
postgres@digoal-> psql  
psql (9.4.4)  
Type "help" for help.  
postgres=# select count(*) from customer_reviews ;  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
 count   
-------  
  6296  
(1 row)  

当前shard状态

postgres=# select * from pgs_distribution_metadata.shard_placement;  
 id | shard_id | shard_state | node_name | node_port   
----+----------+-------------+-----------+-----------  
  2 |    10000 |           1 | localhost |      1923  
  3 |    10001 |           1 | localhost |      1923  
  4 |    10001 |           1 | localhost |      1924  
  5 |    10002 |           1 | localhost |      1924  
  6 |    10002 |           1 | localhost |      1925  
  7 |    10003 |           1 | localhost |      1925  
 10 |    10004 |           1 | localhost |      1923  
 11 |    10005 |           1 | localhost |      1923  
 12 |    10005 |           1 | localhost |      1924  
 13 |    10006 |           1 | localhost |      1924  
 14 |    10006 |           1 | localhost |      1925  
 15 |    10007 |           1 | localhost |      1925  
 18 |    10008 |           1 | localhost |      1923  
 19 |    10009 |           1 | localhost |      1923  
 20 |    10009 |           1 | localhost |      1924  
 21 |    10010 |           1 | localhost |      1924  
 22 |    10010 |           1 | localhost |      1925  
 23 |    10011 |           1 | localhost |      1925  
 26 |    10012 |           1 | localhost |      1923  
 27 |    10013 |           1 | localhost |      1923  
 28 |    10013 |           1 | localhost |      1924  
 29 |    10014 |           1 | localhost |      1924  
 30 |    10014 |           1 | localhost |      1925  
 31 |    10015 |           1 | localhost |      1925  
  8 |    10003 |           3 | localhost |      1922  
 32 |    10015 |           3 | localhost |      1922  
  9 |    10004 |           3 | localhost |      1922  
 25 |    10012 |           3 | localhost |      1922  
 24 |    10011 |           3 | localhost |      1922  
 17 |    10008 |           3 | localhost |      1922  
  1 |    10000 |           3 | localhost |      1922  
 16 |    10007 |           3 | localhost |      1922  
(32 rows)  

状态为3的shard是有问题需要修复的。

使用pgbench产生一些数据变更,从而导致这些shard的数据和其他副本不一致。

postgres@digoal-> pgbench -M simple -n -r -P 1 -f ./test.sql -c 8 -j 8 -T 1000  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
WARNING:  Connection failed to localhost:1922  
DETAIL:  Remote message: could not connect to server: Connection refused  
        Is the server running on host "localhost" (::1) and accepting  
        TCP/IP connections on port 1922?  
could not connect to server: Connection refused  
        Is the server running on host "localhost" (127.0.0.1) and accepting  
        TCP/IP connections on port 1922?  
progress: 1.0 s, 167.9 tps, lat 24.011 ms stddev 84.151  
progress: 2.0 s, 1306.0 tps, lat 8.279 ms stddev 40.771  
progress: 3.0 s, 1451.7 tps, lat 5.505 ms stddev 1.778  
progress: 4.0 s, 1469.8 tps, lat 5.445 ms stddev 1.529  
progress: 5.0 s, 1447.0 tps, lat 5.519 ms stddev 2.082  
progress: 6.0 s, 1439.4 tps, lat 5.554 ms stddev 2.656  

启动关闭的节点

postgres@digoal-> pg_ctl start -D /data01/pg_root_1922  
server starting  
postgres@digoal->  0LOG:  00000: redirecting log output to logging collector process  
 0HINT:  Future log output will appear in directory "pg_log".  
 0LOCATION:  SysLogger_Start, syslogger.c:645  
  
postgres@digoal-> psql  
psql (9.4.4)  
Type "help" for help.  
postgres=# select count(*) from customer_reviews ;  
 count   
-------  
 17729  
(1 row)  

修复前,在所有work节点,对应的database中创建pg_shard extension,因为需要用到一个修复函数。

psql -h 127.0.0.1 -p 1922 -c "create extension pg_shard;"  
psql -h 127.0.0.1 -p 1923 -c "create extension pg_shard;"  
psql -h 127.0.0.1 -p 1924 -c "create extension pg_shard;"  
psql -h 127.0.0.1 -p 1925 -c "create extension pg_shard;"  

修复,连接到主节点,执行:

postgres=# select t1.shard_id,t1.node_name,t1.node_port,t2.node_name,t2.node_port from pgs_distribution_metadata.shard_placement t1 , pgs_distribution_metadata.shard_placement t2 where t1.shard_id=t2.shard_id and (t1.node_name||t1.node_port) <> (t2.node_name||t2.node_port) and t1.shard_state=3;  
 shard_id | node_name | node_port | node_name | node_port   
----------+-----------+-----------+-----------+-----------  
    10003 | localhost |      1922 | localhost |      1925  
    10004 | localhost |      1922 | localhost |      1923  
    10008 | localhost |      1922 | localhost |      1923  
    10011 | localhost |      1922 | localhost |      1925  
    10012 | localhost |      1922 | localhost |      1923  
    10015 | localhost |      1922 | localhost |      1925  
(6 rows)  

用这个函数来修复,内部方法是拷贝数据。源拷贝到目标。

public | master_copy_shard_placement     | void             | shard_id bigint, source_node_name text, source_node_port integer, target_node_name text, target_node_port integer    

参数。千万不要搞反了。(pg_shard有保护措施,搞反了会报错)

postgres=# select master_copy_shard_placement(t1.shard_id,t1.node_name,t1.node_port,t2.node_name,t2.node_port) from pgs_distribution_metadata.shard_placement t1 , pgs_distribution_metadata.shard_placement t2 where t1.shard_id=t2.shard_id and (t1.node_name||t1.node_port) <> (t2.node_name||t2.node_port) and t1.shard_state=3;  
ERROR:  22023: source placement must be in finalized state  
LOCATION:  master_copy_shard_placement, repair_shards.c:109  

修复:

postgres=# select master_copy_shard_placement(t1.shard_id,t2.node_name,t2.node_port,t1.node_name,t1.node_port) from pgs_distribution_metadata.shard_placement t1 , pgs_distribution_metadata.shard_placement t2 where t1.shard_id=t2.shard_id and (t1.node_name||t1.node_port) <> (t2.node_name||t2.node_port) and t1.shard_state=3;  
 master_copy_shard_placement   
-----------------------------  
(8 rows)  

拷贝完成后,状态变为1了。

postgres=# select * from pgs_distribution_metadata.shard_placement;  
 id | shard_id | shard_state | node_name | node_port   
----+----------+-------------+-----------+-----------  
  2 |    10000 |           1 | localhost |      1923  
  3 |    10001 |           1 | localhost |      1923  
  4 |    10001 |           1 | localhost |      1924  
  5 |    10002 |           1 | localhost |      1924  
  6 |    10002 |           1 | localhost |      1925  
  7 |    10003 |           1 | localhost |      1925  
 10 |    10004 |           1 | localhost |      1923  
 11 |    10005 |           1 | localhost |      1923  
 12 |    10005 |           1 | localhost |      1924  
 13 |    10006 |           1 | localhost |      1924  
 14 |    10006 |           1 | localhost |      1925  
 15 |    10007 |           1 | localhost |      1925  
 18 |    10008 |           1 | localhost |      1923  
 19 |    10009 |           1 | localhost |      1923  
 20 |    10009 |           1 | localhost |      1924  
 21 |    10010 |           1 | localhost |      1924  
 22 |    10010 |           1 | localhost |      1925  
 23 |    10011 |           1 | localhost |      1925  
 26 |    10012 |           1 | localhost |      1923  
 27 |    10013 |           1 | localhost |      1923  
 28 |    10013 |           1 | localhost |      1924  
 29 |    10014 |           1 | localhost |      1924  
 30 |    10014 |           1 | localhost |      1925  
 31 |    10015 |           1 | localhost |      1925  
  1 |    10000 |           1 | localhost |      1922  
 16 |    10007 |           1 | localhost |      1922  
  8 |    10003 |           1 | localhost |      1922  
  9 |    10004 |           1 | localhost |      1922  
 17 |    10008 |           1 | localhost |      1922  
 24 |    10011 |           1 | localhost |      1922  
 25 |    10012 |           1 | localhost |      1922  
 32 |    10015 |           1 | localhost |      1922  
(32 rows)  

分片数据修复后,查询结果和之前一致。在修复前,pg_shard根据状态过滤了不健康的副本的查询。因此产生的结果是一致的。

postgres=# select count(*) from customer_reviews;  
 count   
-------  
 17729  
(1 row)  

[参考]

  1. https://github.com/citusdata/pg_shard/tree/v1.2.2
  2. https://www.citusdata.com/citus-products/pg-shard/pg-shard-quick-start-guide
    中期TODO,不支持分布式JOIN,不支持分布式事务,不支持非分布列的唯一约束,FK约束。

短期TODO,不支持表结构修改,不支持删除表,不支持子查询。

Limitations  
pg_shard is intentionally limited in scope during its first release, but is fully functional within that scope. We classify pg_shard's current limitations into two groups. In one group, we have features that we don't intend to support in the medium term due to architectural decisions we made:  
Transactional semantics for queries that span across multiple shards - For example, you're a financial institution and you sharded your data based on customer_id. You'd now like to withdraw money from one customer's account and debit it to another one's account, in a single transaction block.  
Unique constraints on columns other than the partition key, or foreign key constraints.  
Distributed JOINs also aren't supported in pg_shard - If you'd like to run complex analytic queries, please consider upgrading to CitusDB.  
Another group of limitations are shorter-term but we're calling them out here to be clear about unsupported features:  
Table alterations are not supported: customers who do need table alterations accomplish them by using a script that propagates such changes to all worker nodes.  
DROP TABLE does not have any special semantics when used on a distributed table. An upcoming release will add a shard cleanup command to aid in removing shard objects from worker nodes.  
Queries such as INSERT INTO foo SELECT bar, baz FROM qux are not supported.  
Besides these limitations, we have a list of features that we're looking to add. Instead of prioritizing this list ourselves, we decided to keep an open discussion on GitHub issues and hear what you have to say. So, if you have a favorite feature missing from pg_shard, please do get in touch!  
相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
相关文章
|
SQL 自然语言处理 监控
PostgreSQL插件汇总
一专多长的数据库——PostgreSQL
2586 0
|
6月前
|
SQL 关系型数据库 Go
《增强你的PostgreSQL:最佳扩展和插件推荐》
《增强你的PostgreSQL:最佳扩展和插件推荐》
387 0
|
4月前
|
关系型数据库 数据库 PostgreSQL
PostgreSQL【应用 01】使用Vector插件实现向量相似度查询(Docker部署的PostgreSQL安装pgvector插件说明)和Milvus向量库对比
PostgreSQL【应用 01】使用Vector插件实现向量相似度查询(Docker部署的PostgreSQL安装pgvector插件说明)和Milvus向量库对比
167 1
|
4月前
|
关系型数据库 数据库 PostgreSQL
Docker【应用 03】给Docker部署的PostgreSQL数据库安装PostGIS插件(安装流程及问题说明)
Docker【应用 03】给Docker部署的PostgreSQL数据库安装PostGIS插件(安装流程及问题说明)
139 0
|
4月前
|
SQL 监控 关系型数据库
postgresql|数据库|插件学习(二)---postgresql-12的外置插件pg_profile的启用和使用
postgresql|数据库|插件学习(二)---postgresql-12的外置插件pg_profile的启用和使用
66 0
|
4月前
|
SQL 监控 关系型数据库
postgresql|数据库|插件学习(一)---postgresql-12的内置插件pg_stat_statements的启用和使用
postgresql|数据库|插件学习(一)---postgresql-12的内置插件pg_stat_statements的启用和使用
82 0
|
8月前
|
SQL 存储 自然语言处理
玩转阿里云RDS PostgreSQL数据库通过pg_jieba插件进行分词
在当今社交媒体的时代,人们通过各种平台分享自己的生活、观点和情感。然而,对于平台管理员和品牌经营者来说,了解用户的情感和意见变得至关重要。为了帮助他们更好地了解用户的情感倾向,我们可以使用PostgreSQL中的pg_jieba插件对这些发帖进行分词和情感分析,来构建一个社交媒体情感分析系统,系统将根据用户的发帖内容,自动判断其情感倾向是积极、消极还是中性,并将结果存储在数据库中。
玩转阿里云RDS PostgreSQL数据库通过pg_jieba插件进行分词
|
9月前
|
存储 SQL 关系型数据库
PostgreSQL插件HypoPG:支持虚拟索引
PostgreSQL插件HypoPG:支持虚拟索引
320 0
|
9月前
|
安全 关系型数据库 PostgreSQL
PostgreSQL中插件如何新增一个配置项
PostgreSQL中插件如何新增一个配置项
108 0
|
存储 算法 Cloud Native
PolarDB 开源生态插件心选 - 这些插件让业务战斗力提升100倍!!!
PolarDB 的云原生存算分离架构, 具备低廉的数据存储、高效扩展弹性、高速多机并行计算能力、高速数据搜索和处理; PolarDB与计算算法结合, 将实现双剑合璧, 推动业务数据的价值产出, 将数据变成生产力. 本文将介绍PolarDB 开源版插件生态, 通过插件给数据库加装新的算法和索引|存储结构, 结合PolarDB的大规模存储管理能力, 实现算法和存储双剑合璧, 是企业在数据驱动时代的决胜利器.
348 0

相关产品

  • 云原生数据库 PolarDB