Greenplum 6已合并到PostgreSQL 9.3版本 - 比上一代GP提升:8倍读,195倍更新、删除 - 另有大量PG新特性

本文涉及的产品
RDS PostgreSQL Serverless,0.5-4RCU 50GB 3个月
推荐场景:
对影评进行热评分析
RDS SQL Server Serverless,2-4RCU 50GB 3个月
推荐场景:
云数据库 RDS MySQL,集群系列 2核4GB
推荐场景:
搭建个人博客
简介: 标签PostgreSQL , Greenplum , 6 , gin , 异步事务背景Greenplum 6已合并到PostgreSQL 9.3版本 - 相比5性能提升:读8倍,更新、删除195倍 - 另有大量PG新特性,详见PostgreSQL release noteshttps://www.postgresql.org/docs/11/static/release.html例如1、GIN、SPGIST 索引接口。

标签

PostgreSQL , Greenplum , 6 , gin , 异步事务


背景

Greenplum 6已合并到PostgreSQL 9.3版本 - 相比5性能提升:读8倍,更新、删除195倍 - 另有大量PG新特性,详见PostgreSQL release notes

https://www.postgresql.org/docs/11/static/release.html

例如

1、GIN、SPGIST 索引接口。在模糊查询、全文检索、向量相似计算方面终于有索引加速了。

2、异步事务,小事务写入有大幅提升。

3、物化视图,OLAP中很好用的功能。

4、事件触发器,基于事件控制细粒度DDL权限。

5、整体性能增强。

以上特性都是通过升级PostgreSQL版本加入的。

其他增强:

1、增加跨表的分布式死锁检测。

2、更新、删除由表级排他锁改成行级排他锁,大幅提升DML性能。

gpdb 6 部署示例

与如下同测试环境:

《Deepgreen(Greenplum) 多机部署测试 , TPC-H VS citus》

配置ECS虚拟机OS参数 (all host)

1、内核参数

vi /etc/sysctl.conf      
      
# add by digoal.zhou          
fs.aio-max-nr = 1048576          
fs.file-max = 76724600          
          
# 可选:kernel.core_pattern = /data01/corefiles/core_%e_%u_%t_%s.%p                   
# /data01/corefiles 事先建好,权限777,如果是软链接,对应的目录修改为777          
          
kernel.sem = 4096 2147483647 2147483646 512000              
# 信号量, ipcs -l 或 -u 查看,每16个进程一组,每组信号量需要17个信号量。          
          
kernel.shmall = 107374182                
# 所有共享内存段相加大小限制 (建议内存的80%),单位为页。          
kernel.shmmax = 274877906944             
# 最大单个共享内存段大小 (建议为内存一半), >9.2的版本已大幅降低共享内存的使用,单位为字节。          
kernel.shmmni = 819200                   
# 一共能生成多少共享内存段,每个PG数据库集群至少2个共享内存段          
          
net.core.netdev_max_backlog = 10000          
net.core.rmem_default = 262144                 
# The default setting of the socket receive buffer in bytes.          
net.core.rmem_max = 4194304                    
# The maximum receive socket buffer size in bytes          
net.core.wmem_default = 262144                 
# The default setting (in bytes) of the socket send buffer.          
net.core.wmem_max = 4194304                    
# The maximum send socket buffer size in bytes.          
net.core.somaxconn = 4096          
net.ipv4.tcp_max_syn_backlog = 4096          
net.ipv4.tcp_keepalive_intvl = 20          
net.ipv4.tcp_keepalive_probes = 3          
net.ipv4.tcp_keepalive_time = 60          
net.ipv4.tcp_mem = 8388608 12582912 16777216          
net.ipv4.tcp_fin_timeout = 5          
net.ipv4.tcp_synack_retries = 2          
net.ipv4.tcp_syncookies = 1              
# 开启SYN Cookies。当出现SYN等待队列溢出时,启用cookie来处理,可防范少量的SYN攻击          
net.ipv4.tcp_timestamps = 1              
# 减少time_wait          
net.ipv4.tcp_tw_recycle = 0              
# 如果=1则开启TCP连接中TIME-WAIT套接字的快速回收,但是NAT环境可能导致连接失败,建议服务端关闭它          
net.ipv4.tcp_tw_reuse = 1                
# 开启重用。允许将TIME-WAIT套接字重新用于新的TCP连接          
net.ipv4.tcp_max_tw_buckets = 262144          
net.ipv4.tcp_rmem = 8192 87380 16777216          
net.ipv4.tcp_wmem = 8192 65536 16777216          
          
net.nf_conntrack_max = 1200000          
net.netfilter.nf_conntrack_max = 1200000          
          
vm.dirty_background_bytes = 409600000                 
#  系统脏页到达这个值,系统后台刷脏页调度进程 pdflush(或其他) 自动将(dirty_expire_centisecs/100)秒前的脏页刷到磁盘          
#  默认为10%,大内存机器建议调整为直接指定多少字节          
          
vm.dirty_expire_centisecs = 3000                       
#  比这个值老的脏页,将被刷到磁盘。3000表示30秒。          
vm.dirty_ratio = 95                                    
#  如果系统进程刷脏页太慢,使得系统脏页超过内存 95 % 时,则用户进程如果有写磁盘的操作(如fsync, fdatasync等调用),则需要主动把系统脏页刷出。          
#  有效防止用户进程刷脏页,在单机多实例,并且使用CGROUP限制单实例IOPS的情况下非常有效。            
          
vm.dirty_writeback_centisecs = 100                      
#  pdflush(或其他)后台刷脏页进程的唤醒间隔, 100表示1秒。          
          
vm.swappiness = 0          
#  不使用交换分区          
          
vm.mmap_min_addr = 65536          
vm.overcommit_memory = 0               
#  在分配内存时,允许少量over malloc, 如果设置为 1, 则认为总是有足够的内存,内存较少的测试环境可以使用 1 .            
          
vm.overcommit_ratio = 90               
#  当overcommit_memory = 2 时,用于参与计算允许指派的内存大小。          
         
vm.zone_reclaim_mode = 0               
# 禁用 numa, 或者在vmlinux中禁止.           
net.ipv4.ip_local_port_range = 40000 65535              
# 本地自动分配的TCP, UDP端口号范围          
fs.nr_open=20480000          
# 单个进程允许打开的文件句柄上限          
          
# 以下参数请注意          
# vm.extra_free_kbytes = 4096000          
# vm.min_free_kbytes = 2097152  # vm.min_free_kbytes 建议每32G内存分配1G vm.min_free_kbytes         
# 如果是小内存机器,以上两个值不建议设置          
# vm.nr_hugepages = 66536              
#  建议shared buffer设置超过64GB时 使用大页,页大小 /proc/meminfo Hugepagesize          
# vm.lowmem_reserve_ratio = 1 1 1          
# 对于内存大于64G时,建议设置,否则建议默认值 256 256 32      

2、资源限制

vi /etc/security/limits.conf      
      
# nofile超过1048576的话,一定要先将sysctl的fs.nr_open设置为更大的值,并生效后才能继续设置nofile.          
          
* soft    nofile  1024000          
* hard    nofile  1024000          
* soft    nproc   unlimited          
* hard    nproc   unlimited          
* soft    core    unlimited          
* hard    core    unlimited          
* soft    memlock unlimited          
* hard    memlock unlimited         

3、关闭透明大页,使用精准时钟(可选)

vi /etc/rc.local      
      
touch /var/lock/subsys/local      
           
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then          
   echo never > /sys/kernel/mm/transparent_hugepage/enabled          
fi        
    
# tsc 时钟    
echo tsc > /sys/devices/system/clocksource/clocksource0/current_clocksource      
chmod +x /etc/rc.d/rc.local  

部署gpdb (all host)

1、部署依赖(root执行)

rpm -ivh https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm     
  
yum install -y coreutils glib2 lrzsz mpstat dstat sysstat e4fsprogs xfsprogs ntp readline-devel zlib-devel openssl-devel pam-devel libxml2-devel libxslt-devel python-devel tcl-devel gcc make smartmontools flex bison perl-devel perl-ExtUtils* openldap-devel jadetex  openjade bzip2 curl-devel curl apr-devel apr cmake3 python git iotop perf gcc-c++ dstat bzip2-devel krb5-devel libcurl-devel libevent-devel libkadm5 libyaml-devel libxml2-devel openssl-devel perl-ExtUtils-Embed python-devel sysstat python-pip xerces-c-devel  
  
  
  
pip install --upgrade pip  
  
pip install paramiko  pycrypto psutil lockfile pidlockfile  

2、部署gpdb(root执行)

cd ~  
git clone https://github.com/greenplum-db/gpdb  
cd gpdb  
  
pip install -r python-dependencies.txt  
  
pip install -r python-developer-dependencies.txt  
  
  
./configure --disable-orca --with-perl --with-python --with-libxml --prefix=/opt/gpdb6  
make -j 128  
make install  

3、检查当前GPDB的pg版本

/opt/gpdb6/bin/psql -V  
psql (PostgreSQL) 9.3beta1  

规划存储目录 (all host)

useradd digoal  
passwd digoal   
  
  
  
mkdir /data01/gpdb6  
chown digoal:digoal /data01/gpdb6  

初始化GPDB集群 (master host)

1、配置文件

vi /opt/gpdb6/greenplum_path.sh   
  
# 追加  
export MASTER_DATA_DIRECTORY=/data01/gpdb6/gp-1  
export PGDATA=$MASTER_DATA_DIRECTORY  
export PGHOST=127.0.0.1  
export PGPORT=18000  
export PGUSER=digoal  
export PGPASSWORD=123  
export PGDATABASE=postgres  

2、环境变量

su - digoal  
  
  
vi ~/.bash_profile  
  
# 追加  
. /opt/gpdb6/greenplum_path.sh  

3、配置集群主机文件,包含所有节点(master, standby, segment, mirror hosts)

su - digoal  
  
vi hostfile  
  
digoal-citus-gpdb-test001  
digoal-citus-gpdb-test002  
digoal-citus-gpdb-test003  
digoal-citus-gpdb-test004  
digoal-citus-gpdb-test005  
digoal-citus-gpdb-test006  
digoal-citus-gpdb-test007  
digoal-citus-gpdb-test008  
digoal-citus-gpdb-test009  

4、配置host认证

《Deepgreen(Greenplum) 多机部署测试 , TPC-H VS citus》

gpssh-exkeys -f ./hostfile  

5、配置集群初始化文件

如果master host不想配置segment node,则需要修改一下以上hostfile,把master host去掉。

本例在所有主机上初始化segment.

vi cluster.conf  
  
ARRAY_NAME="mpp1 cluster"  
CLUSTER_NAME="mpp1 cluster"  
MACHINE_LIST_FILE=hostfile  
SEG_PREFIX=gp  
DATABASE_PREFIX=gp  
PORT_BASE=28000  
declare -a DATA_DIRECTORY=(/data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6 /data01/gpdb6)  
MASTER_HOSTNAME=digoal-citus-gpdb-test001  
MASTER_DIRECTORY=/data01/gpdb6  
MASTER_PORT=18000  
IP_ALLOW=0.0.0.0/0  
TRUSTED_SHELL=/usr/bin/ssh  
CHECK_POINT_SEGMENTS=32  
ENCODING=UNICODE  
export MASTER_DATA_DIRECTORY  
export TRUSTED_SHELL  
DEFAULT_QD_MAX_CONNECT=250  
QE_CONNECT_FACTOR=5  

6、初始化集群

. /opt/gpdb6/greenplum_path.sh   
  
  
  
gpinitsystem -c cluster.conf -h hostfile   

7、配置参数

gpconfig -c max_connections -v 500 -m 400  
gpconfig -c shared_buffers -v '1GB'  
gpconfig -c max_prepared_transactions -v '1500'  
gpconfig -c max_stack_depth -v '4MB'  
gpconfig -c vacuum_cost_delay -v '0'  
gpconfig -c synchronous_commit -v 'off'  
gpconfig -c wal_buffers -v '16MB'  
gpconfig -c wal_writer_delay -v '10ms'  
gpconfig -c checkpoint_segments -v '128' --skipvalidation  
gpconfig -c random_page_cost -v '1.3'  
gpconfig -c log_statement -v 'ddl'  
gpconfig -c vacuum_freeze_table_age -v '1200000000'  
gpconfig -c autovacuum_freeze_max_age -v '1300000000' --skipvalidation  
gpconfig -c autovacuum_vacuum_cost_delay -v '0' --skipvalidation  
gpconfig -c autovacuum -v 'on' --skipvalidation  

重启实例

gpstop -M fast -a  
  
gpstart -a  

以下参数不允许修改,详见GUC文件

src/backend/utils/misc/guc.c

autovacuum  
autovacuum_freeze_max_age  
autovacuum_vacuum_cost_delay  

GPDB 6 改进评测

1 支持异步事务

PostgreSQL 8.3 就有了,异步事务开启后,对于IO性能较差的盘,小事务的性能提升非常明显。

synchronous_commit = off  
wal_buffers = 16MB  
wal_writer_delay = 10ms  

2 gin 倒排索引

GIN倒排索引,支持多值列(例如数组、JSON、HSTORE、全文检索),多列任意组合查询索引加速。

例子

postgres=# create table t(id int, c1 int[]);  
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.  
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.  
CREATE TABLE  
  
postgres=# create index idx_t_1 on t using gin (c1);  
CREATE INDEX  
  
postgres=# create or replace function gen_rand_arr(int,int) returns int[] as $$  
postgres$#   select array(select (random()*$1)::int from generate_series(1,$2));  
postgres$# $$ language sql strict;  
CREATE FUNCTION  
  
  
postgres=# select gen_rand_arr(100,10);  
         gen_rand_arr            
-------------------------------  
 {3,85,71,73,91,2,29,81,69,77}  
(1 row)  
  
postgres=# insert into t select id,gen_rand_arr(10000,10) from generate_series(1,10000000) t(id);  
INSERT 0 10000000  
  
  
postgres=# explain analyze select * from t where c1 @> array[1,2];  
                                                      QUERY PLAN                                                         
-----------------------------------------------------------------------------------------------------------------------  
 Gather Motion 4:1  (slice1; segments: 4)  (cost=6.50..7.81 rows=1 width=65) (actual time=0.340..0.426 rows=5 loops=2)  
   ->  Bitmap Heap Scan on t  (cost=6.50..7.81 rows=1 width=65) (actual time=0.329..0.335 rows=2 loops=2)  
         Recheck Cond: (c1 @> '{1,2}'::integer[])  
         ->  Bitmap Index Scan on idx_t_1  (cost=0.00..6.50 rows=1 width=0) (actual time=0.259..0.259 rows=0 loops=2)  
               Index Cond: (c1 @> '{1,2}'::integer[])  
   (slice0)    Executor memory: 322K bytes.  
   (slice1)    Executor memory: 779K bytes avg x 4 workers, 907K bytes max (seg1).  Work_mem: 33K bytes max.  
 Memory used:  128000kB  
 Optimizer: legacy query optimizer  
 Total runtime: 1.208 ms  
(10 rows)  
  
set enable_bitmapscan =off;  
  
postgres=# explain analyze select * from t where c1 @> array[1,2];  
                                                         QUERY PLAN                                                           
----------------------------------------------------------------------------------------------------------------------------  
 Gather Motion 4:1  (slice1; segments: 4)  (cost=0.00..16.50 rows=1 width=65) (actual time=503.962..636.780 rows=5 loops=2)  
   ->  Seq Scan on t  (cost=0.00..16.50 rows=1 width=65) (actual time=34.927..588.086 rows=2 loops=2)  
         Filter: (c1 @> '{1,2}'::integer[])  
   (slice0)    Executor memory: 322K bytes.  
   (slice1)    Executor memory: 54K bytes avg x 4 workers, 54K bytes max (seg0).  
 Memory used:  128000kB  
 Optimizer: legacy query optimizer  
 Total runtime: 1273.949 ms  
(8 rows)  

相比全表扫描,使用GIN索引的查询性能提升上千倍。

3 spgist 索引,范围类型

spgist索引接口,以及范围类型。

postgres=# create table t(id int, rg int4range);  
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.  
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.  
CREATE TABLE  
postgres=# create index idx_t_1 on t using spgist (rg);  
CREATE INDEX  
postgres=# insert into t values (1, int4range(1,100));  
INSERT 0 1  
postgres=# insert into t values (2, int4range(101,200));  
INSERT 0 1  
  
postgres=# explain select * from t where rg @> 1;  
                                 QUERY PLAN                                    
-----------------------------------------------------------------------------  
 Gather Motion 4:1  (slice1; segments: 4)  (cost=0.12..2.74 rows=1 width=18)  
   ->  Index Scan using idx_t_1 on t  (cost=0.12..2.74 rows=1 width=18)  
         Index Cond: (rg @> 1)  
 Optimizer: legacy query optimizer  
(4 rows)  
  
postgres=# select * from t where rg @> 1;  
 id |   rg      
----+---------  
  1 | [1,100)  
(1 row)  

范围类型在一些场景的应用

《会议室预定系统实践(解放开发) - PostgreSQL tsrange(时间范围类型) + 排他约束》

《PostgreSQL 黑科技 range 类型及 gist index 20x+ speedup than Mysql index combine query》

4 增加行级排他锁,优化分布式死锁检测

1、原有分布式死锁检测

postgres=# show deadlock_timeout ;  
 deadlock_timeout   
------------------  
 1s  
(1 row)  
  
  
postgres=# create table a (id int, c1 int);  
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.  
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.  
CREATE TABLE  
Time: 156.785 ms  
postgres=# create table b (id int, c1 int);  
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.  
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.  
CREATE TABLE  
Time: 87.583 ms  
  
  
postgres=# insert into a values (1,1);  
INSERT 0 1  
Time: 58.180 ms  
postgres=# insert into b values (1,1);  
INSERT 0 1  
Time: 72.548 ms  
  
  
postgres=# begin;  
BEGIN  
postgres=# update b set c1=2 where id=1;  
UPDATE 1  
...  
postgres=# update a set c1=2 where id=1;  
UPDATE 1  
  
  
postgres=# begin;  
BEGIN  
Time: 1.997 ms  
postgres=# update a set c1=2 where id=1;  
UPDATE 1  
Time: 1.425 ms  
...  
postgres=# update b set c1=2 where id=1;  
ERROR:  deadlock detected  
LINE 1: update b set c1=2 where id=1;  
               ^  
DETAIL:  Process 26021 waits for ExclusiveLock on relation 36930 of database 12097; blocked by process 26091.  
Process 26091 waits for ExclusiveLock on relation 36927 of database 12097; blocked by process 26021.  

gpdb 6 与 gpdb 5 行为一致

postgres=# begin;  
BEGIN  
postgres=# update a set c1=2 where id=1;  
UPDATE 1  
...  
postgres=# update b set c1=2 where id=1;  
UPDATE 1  
  
  
postgres=# begin;  
BEGIN  
postgres=# update b set c1=3 where id=1;  
UPDATE 1  
...  
postgres=# update a set c1=3 where id=1;  
ERROR:  deadlock detected  (seg2 127.0.0.1:24002 pid=3721)  
DETAIL:  Process 3721 waits for ShareLock on transaction 1306618; blocked by process 3703.  
Process 3703 waits for ShareLock on transaction 1306619; blocked by process 3721.  
HINT:  See server log for query details.  

2、gpdb6增加了行级锁

(gpdb6以前为表级排他锁)

对同一张表的delete\update操作,堵塞insert\update\delete

begin;  
update a set c1=2 where id=1;  

堵塞其他会话对同一张表的如下操作:insert\update\delete:

update a set c1=3 where id=2;  
insert into a values (3,1);  
delete from a where id=2;  

gpdb6

行级锁,以上操作不堵塞。

行级分布式死锁检测,参数gp_global_deadlock_detector_period

postgres=# show gp_global_deadlock_detector_period;  
 gp_global_deadlock_detector_period   
------------------------------------  
 2min  
(1 row)  
  
  
  
postgres=# create table a (id int, c1 int);  
NOTICE:  Table doesn't have 'DISTRIBUTED BY' clause -- Using column named 'id' as the Greenplum Database data distribution key for this table.  
HINT:  The 'DISTRIBUTED BY' clause determines the distribution of data. Make sure column(s) chosen are the optimal data distribution key to minimize skew.  
CREATE TABLE  
postgres=# insert into a values (1,1),(2,2);  
INSERT 0 2  
postgres=# begin;  
BEGIN  
postgres=# update a set c1=2 where id=1;  
UPDATE 1  
...  
postgres=# update a set c1=1 where id=2;  
UPDATE 1  
  
  
postgres=# begin;  
BEGIN  
postgres=# update a set c1=3 where id=2;  
UPDATE 1  
...  
postgres=# update a set c1=3 where id=1;  
ERROR:  canceling statement due to user request: "cancelled by global deadlock detector"  

5 TP性能提升

1亿数据量,TPCB,只读,读写混合测试。

pgbench -i -s 1000 -h 127.0.0.1 -p 15432 -U postgres postgres  
  
pgbench -M simple -v -r -P 1 -c 16 -j 16 -h 127.0.0.1 -p 15432 -U postgres postgres -T 120 -S  
  
pgbench -M simple -v -r -P 1 -c 16 -j 16 -h 127.0.0.1 -p 15432 -U postgres postgres -T 120  

1 gpdb 5

postgres=# select version();  
                        version               
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------  
 PostgreSQL 8.3.23 (Greenplum Database 5.10.2+7615c3b build ga) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.3.1 20170216 (Red Hat 6.3.1-3), 64-bit compiled on Aug 25 2018 08:21:26  
(1 row)  
  
postgres=# show gp_server_version;  
    gp_server_version      
-------------------------  
 5.10.2+7615c3b build ga  
(1 row)  
  
postgres=# show gp_server_version_num;  
 gp_server_version_num   
-----------------------  
 51002  
(1 row)  

1、tpcb 只读

transaction type: <builtin: select only>  
scaling factor: 1000  
query mode: simple  
number of clients: 16  
number of threads: 16  
duration: 120 s  
number of transactions actually processed: 361911  
latency average = 5.306 ms  
latency stddev = 0.854 ms  
tps = 3014.205774 (including connections establishing)  
tps = 3014.474824 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set aid random(1, 100000 * :scale)  
         5.303  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  

2、tpcb 读写混合

transaction type: <builtin: TPC-B (sort of)>  
scaling factor: 1000  
query mode: simple  
number of clients: 16  
number of threads: 16  
duration: 120 s  
number of transactions actually processed: 1822  
latency average = 1057.754 ms  
latency stddev = 130.580 ms  
tps = 15.074113 (including connections establishing)  
tps = 15.075498 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.005  \set aid random(1, 100000 * :scale)  
         0.001  \set bid random(1, 1 * :scale)  
         0.001  \set tid random(1, 10 * :scale)  
         0.001  \set delta random(-5000, 5000)  
         2.417  BEGIN;  
       990.377  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
         7.147  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
         1.186  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
         1.081  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
         0.674  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
        54.865  END;  

2 gpdb 6

psql  
psql (9.3beta1)  
Type "help" for help.  
  
postgres=# select version();  
                                   version                
---------------------------------------------------------------------------------------------------------------------------------------------------------  
 PostgreSQL 9.3beta1 (Greenplum Database 6.0.0-alpha.0+dev.11201.gb2e98d4 build dev-oss) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit compiled on Sep 26 2018 15:00:09  
(1 row)  
  
postgres=# show gp_server_version;  
               gp_server_version                  
------------------------------------------------  
 6.0.0-alpha.0+dev.11201.gb2e98d4 build dev-oss  
(1 row)  
  
postgres=# show gp_server_version_num;  
 gp_server_version_num   
-----------------------  
 60000  
(1 row)  

1、tpcb 只读

transaction type: <builtin: select only>  
scaling factor: 1000  
query mode: simple  
number of clients: 16  
number of threads: 16  
duration: 120 s  
number of transactions actually processed: 3006326  
latency average = 0.639 ms  
latency stddev = 0.107 ms  
tps = 25052.487094 (including connections establishing)  
tps = 25056.694155 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set aid random(1, 100000 * :scale)  
         0.637  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  

2、tpcb 读写混合

transaction type: <builtin: TPC-B (sort of)>  
scaling factor: 1000  
query mode: simple  
number of clients: 16  
number of threads: 16  
duration: 120 s  
number of transactions actually processed: 351382  
latency average = 5.465 ms  
latency stddev = 9.487 ms  
tps = 2927.029739 (including connections establishing)  
tps = 2927.497105 (excluding connections establishing)  
script statistics:  
 - statement latencies in milliseconds:  
         0.002  \set aid random(1, 100000 * :scale)  
         0.001  \set bid random(1, 1 * :scale)  
         0.001  \set tid random(1, 10 * :scale)  
         0.000  \set delta random(-5000, 5000)  
         0.195  BEGIN;  
         0.779  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;  
         0.692  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;  
         0.703  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;  
         0.685  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;  
         0.566  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);  
         1.838  END;  

小结

Greenplum 6已合并到PostgreSQL 9.3版本,相比5性能提升:读8倍,更新、删除195倍,同时有大量PG新特性,例如

1、GIN、SPGIST 索引接口。在模糊查询、全文检索、向量相似计算方面支持索引加速。 1000万行的数组检索,性能提升了1000倍。

2、支持异步事务,小事务写入有大幅提升。

3、支持物化视图,OLAP中很好用的功能。

4、事件触发器,基于事件控制细粒度DDL权限。

5、整体性能增强,OLTP 读8倍,更新、删除195倍。

以上特性都是通过升级PostgreSQL版本加入的。

其他增强:

1、增加跨表的分布式死锁检测。

2、更新、删除由表级排他锁改成行级排他锁,大幅提升DML性能。表现在OLTP方面,更新、删除混合测试TPCB 提升了195倍。

参考

https://www.postgresql.org/docs/11/static/release.html

https://github.com/greenplum-db/gpdb

《Deepgreen(Greenplum) 多机部署测试 , TPC-H VS citus》

《会议室预定系统实践(解放开发) - PostgreSQL tsrange(时间范围类型) + 排他约束》

《PostgreSQL 黑科技 range 类型及 gist index 20x+ speedup than Mysql index combine query》

相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
目录
相关文章
|
5天前
|
关系型数据库 分布式数据库 数据库
开源云原生数据库PolarDB PostgreSQL 15兼容版本正式发布
PolarDB进行了深度的内核优化,从而实现以更低的成本提供商业数据库的性能。
|
16天前
|
关系型数据库 MySQL Shell
MySQL数据库一键安装脚本,适合任何版本
MySQL数据库一键安装脚本,适合任何版本
22 2
|
21天前
|
关系型数据库 MySQL Linux
一文教会你如何在Linux系统中使用Docker安装Mysql 5.7版本 【详细过程+图解】
这篇文章提供了在Linux系统中使用Docker安装Mysql 5.7版本的详细过程和图解,包括安装指定版本、创建实例、启动、使用Navicat连接测试、文件挂载与端口映射、进入容器、配置文件修改以及重新启动容器等步骤。
一文教会你如何在Linux系统中使用Docker安装Mysql 5.7版本 【详细过程+图解】
|
25天前
|
关系型数据库 MySQL 数据库
同一台电脑安装两个不同版本的mysql。简单暴力有效
这篇文章介绍了在同一台电脑上安装并配置两个不同版本的MySQL数据库的过程,包括修改端口号、配置服务、修改连接端口和测试连接情况,以确保不同版本的MySQL能够正常运行而不相互干扰。
同一台电脑安装两个不同版本的mysql。简单暴力有效
|
1月前
|
算法 关系型数据库 MySQL
揭秘MySQL中的版本号排序:这个超级算法将颠覆你的排序世界!
【8月更文挑战第8天】在软件开发与数据管理中,正确排序版本号对软件更新及数据分析至关重要。因MySQL默认按字符串排序版本号,可能出现&#39;1.20.0&#39;在&#39;1.10.0&#39;之前的不合理情况。解决办法是将版本号各部分转换为整数后排序。例如,使用`SUBSTRING_INDEX`和`CAST`函数从`software`表的`version`字段提取并转换版本号,再按这些整数排序。这种方法可确保版本号按逻辑正确排序,适用于&#39;major.minor.patch&#39;格式的版本号。对于更复杂格式,需调整处理逻辑。掌握此技巧可有效应对版本号排序需求。
65 3
|
20天前
|
关系型数据库 MySQL 数据安全/隐私保护
【MySQL】手把手教你MySQL各版本忘记密码如何处理
【MySQL】手把手教你MySQL各版本忘记密码如何处理
|
2月前
|
关系型数据库 MySQL 数据库
|
2月前
|
存储 负载均衡 关系型数据库
面试题MySQL问题之通过配置FastDFS提高性能如何解决
面试题MySQL问题之通过配置FastDFS提高性能如何解决
38 1
|
1月前
|
存储 关系型数据库 MySQL
XtraBackup 的版本大揭秘:突破迷雾,揭开其支持 MySQL 版本的震撼真相,捍卫数据安全!
【8月更文挑战第7天】XtraBackup 是 MySQL 备份领域的热门工具,适用于多个版本。它为 MySQL 8.0 提供高效备份,确保数据安全。同样支持 MySQL 5.7 和其他稳定版如 5.6,精确备份各类数据结构。
55 0
|
3月前
|
SQL 关系型数据库 MySQL
解决 MySQL 5.7 及以上版本中的 “ONLY_FULL_GROUP_BY“ 错误
解决 MySQL 5.7 及以上版本中的 “ONLY_FULL_GROUP_BY“ 错误
419 3

相关产品

  • 云原生数据库 PolarDB
  • 云数据库 RDS PostgreSQL 版