Deepgreen & Greenplum High Availability (1): Segment Node Failover

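The Shangshu (Book of Documents) says: "惟事事,乃其有备,有备无患" (prepare for every undertaking, because preparedness forestalls disaster). If that holds for everyday affairs, it holds all the more for the data warehouse software that underpins an enterprise while it is running.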

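This post sets everything else aside and looks at one high-availability feature of the enterprise data warehouse products Deepgreen and Greenplum: segment mirroring.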

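I. First, the theory: a primary segment and its mirror are placed on different hosts, so that a single point of failure cannot make the database unreachable. When a primary segment fails, its mirror automatically takes over the segment's work and the database stays usable. The switch is transparent to client applications.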
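The placement rule above can be checked against the catalog. The following is only a sketch and not part of the original walkthrough: `colocated_pairs` is a hypothetical helper that reads `content|role|hostname` rows, in the form `psql -A -t -F'|'` would print them from `gp_segment_configuration`, and lists every content ID whose primary and mirror sit on the same host.

```shell
#!/bin/sh
# Hypothetical helper (not a Greenplum utility): report primary/mirror pairs
# that share a host. Input on stdin: one "content|role|hostname" row per line,
# where role is "p" (primary) or "m" (mirror).
colocated_pairs() {
    awk -F'|' '
        $2 == "p" { primary[$1] = $3 }
        $2 == "m" { mirror[$1]  = $3 }
        END {
            for (c in primary)
                if ((c in mirror) && primary[c] == mirror[c])
                    print c "|" primary[c]
        }' | sort -t'|' -k1,1n
}

# Assumed usage against a running cluster:
#   psql -d postgres -A -t -F'|' \
#        -c "SELECT content, role, hostname FROM gp_segment_configuration WHERE content >= 0" \
#     | colocated_pairs
```

On a single-machine test cluster such as the one used below, every pair would be reported, because all instances run on the same host. That is acceptable for a demonstration but gives no protection against a host failure.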
Now for a hands-on example:

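1. Test environment
The author's laptop:
4 cores, 8 GB RAM, 1 TB disk
One master and two segments installed, no mirrors
2. First check the cluster state: 1 master, 2 segments
## Command run: gpstate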
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:---------------------------
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:-Master instance parameters
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:---------------------------
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:-Database                 = template1
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:-Master Port              = 15432
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:-Master directory         = /dgdata/master/dg-1
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:-Timeout                  = 600 seconds
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:-Master standby           = Off
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:---------------------------------------
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:-Segment instances that will be started
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:---------------------------------------
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:-   Host    Datadir               Port
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:-   flash   /dgdata/primary/dg0   25432
20170628:06:26:51:003776 gpstart:flash:dgadmin-[INFO]:-   flash   /dgdata/primary/dg1   25433
3. Create the mirror directory and add the mirrors
## Create the mirror directory
dgadmin@flash:~$ mkdir /dgdata/mirror

## Run gpaddmirrors to add the mirrors

dgadmin@flash:~$ gpaddmirrors

20170628:06:41:50:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Starting gpaddmirrors with args:

20170628:06:41:50:004294 gpaddmirrors:flash:dgadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.99.00 build Deepgreen DB'

20170628:06:41:50:004294 gpaddmirrors:flash:dgadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.99.00 build Deepgreen DB) on x86_64-unknown-linux-gnu, compiled by GCC gcc (Ubuntu 4.9.4-2ubuntu1~14.04.1) 4.9.4 compiled on May 18 2017 05:19:19'

20170628:06:41:50:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Obtaining Segment details from master...

Enter mirror segment data directory location 1 of 2 >

/dgdata/mirror

Enter mirror segment data directory location 2 of 2 >

/dgdata/mirror

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Greenplum Add Mirrors Parameters

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:----------------------------------------------------------

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Greenplum master data directory          = /dgdata/master/dg-1

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Greenplum master port                    = 15432

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Parallel batch limit                     = 16

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:----------------------------------------------------------

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Mirror 1 of 2

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:----------------------------------------------------------

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Primary instance host               = flash

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Primary instance address            = flash

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Primary instance directory          = /dgdata/primary/dg0

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Primary instance port               = 25432

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Primary instance replication port   = 28432

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Mirror instance host                = flash

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Mirror instance address             = flash

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Mirror instance directory           = /dgdata/mirror/dg0

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Mirror instance port                = 26432

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Mirror instance replication port    = 27432

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:----------------------------------------------------------

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Mirror 2 of 2

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:----------------------------------------------------------

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Primary instance host               = flash

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Primary instance address            = flash

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Primary instance directory          = /dgdata/primary/dg1

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Primary instance port               = 25433

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Primary instance replication port   = 28433

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Mirror instance host                = flash

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Mirror instance address             = flash

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Mirror instance directory           = /dgdata/mirror/dg1

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Mirror instance port                = 26433

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:-   Mirror instance replication port    = 27433

20170628:06:42:03:004294 gpaddmirrors:flash:dgadmin-[INFO]:----------------------------------------------------------

Continue with add mirrors procedure Yy|Nn (default=N):

> y

20170628:06:42:06:004294 gpaddmirrors:flash:dgadmin-[INFO]:-2 segment(s) to add

20170628:06:42:06:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Building template directory

20170628:06:42:07:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Validating remote directories

......

20170628:06:42:13:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Copying template directory file

.

20170628:06:42:14:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Configuring new segments

......

20170628:06:42:20:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Cleaning files

.

20170628:06:42:21:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Starting file move procedure for flash:/dgdata/mirror/dg0:content=0:dbid=4:mode=r:status=u

20170628:06:42:21:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Starting file move procedure for flash:/dgdata/mirror/dg1:content=1:dbid=5:mode=r:status=u

20170628:06:42:21:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Updating configuration with new mirrors

20170628:06:42:22:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Updating mirrors

......

20170628:06:42:28:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Starting mirrors

20170628:06:42:28:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...

.......

20170628:06:42:35:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Process results...

20170628:06:42:35:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Updating configuration to mark mirrors up

20170628:06:42:35:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Updating primaries

20170628:06:42:35:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Commencing parallel primary conversion of 2 segments, please wait...

......

20170628:06:42:41:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Process results...

20170628:06:42:41:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Done updating primaries

20170628:06:42:41:004294 gpaddmirrors:flash:dgadmin-[INFO]:-******************************************************************

20170628:06:42:41:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Mirror segments have been added; data synchronization is in progress.

20170628:06:42:41:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Data synchronization will continue in the background.

20170628:06:42:41:004294 gpaddmirrors:flash:dgadmin-[INFO]:-

20170628:06:42:41:004294 gpaddmirrors:flash:dgadmin-[INFO]:-Use  gpstate -s  to check the resynchronization progress.

20170628:06:42:41:004294 gpaddmirrors:flash:dgadmin-[INFO]:-*****************************************************************
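The log points to `gpstate -s` for watching resynchronization. As a complementary sketch, assuming the Greenplum 4.3 catalog, in which `gp_segment_configuration.mode` is `s` for synchronized and `r` for resynchronizing (the `mode=r` visible in the log above), the hypothetical helper `all_synchronized` succeeds only when every segment row is in mode `s` with status `u`.

```shell
#!/bin/sh
# Hypothetical helper (not a Greenplum utility): exit 0 only if every row read
# from stdin is synchronized (mode "s") and up (status "u").
# Input: one "content|role|mode|status" row per line. Empty input counts as failure.
all_synchronized() {
    awk -F'|' '
        $3 != "s" || $4 != "u" { bad = 1 }
        END { exit ((NR > 0 && !bad) ? 0 : 1) }'
}

# Assumed usage against a running cluster:
#   psql -d postgres -A -t -F'|' \
#        -c "SELECT content, role, mode, status FROM gp_segment_configuration WHERE content >= 0" \
#     | all_synchronized && echo "all segments synchronized"
```

The exit status makes the check easy to put in a polling loop or a monitoring script, instead of reading the `gpstate` report by eye.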
4. Check the cluster state again: confirm that both segments now have a mirror
dgadmin@flash:~$ gpstate
20170628:06:59:19:005589 gpstate:flash:dgadmin-[INFO]:-Starting gpstate with args:
20170628:06:59:19:005589 gpstate:flash:dgadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.99.00 build Deepgreen DB'
20170628:06:59:20:005589 gpstate:flash:dgadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.99.00 build Deepgreen DB) on x86_64-unknown-linux-gnu, compiled by GCC gcc (Ubuntu 4.9.4-2ubuntu1~14.04.1) 4.9.4 compiled on May 18 2017 05:19:19'
20170628:06:59:20:005589 gpstate:flash:dgadmin-[INFO]:-Obtaining Segment details from master...
20170628:06:59:20:005589 gpstate:flash:dgadmin-[INFO]:-Gathering data from segments...
....
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-Greenplum instance status summary
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Master instance                                           = Active
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Master standby                                            = No master standby configured
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total segment instance count from metadata                = 4
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Primary Segment Status
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total primary segments                                    = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total primary segment valid (at master)                   = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total primary segment failures (at master)                = 0
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number postmaster processes found                   = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Mirror Segment Status
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total mirror segments                                     = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total mirror segment valid (at master)                    = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total mirror segment failures (at master)                 = 0
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number postmaster processes found                   = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number mirror segments acting as primary segments   = 0
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-   Total number mirror segments acting as mirror segments    = 2
20170628:06:59:24:005589 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
dgadmin@flash:~$
5. Log in to the database and run some test queries
dgadmin@flash:~$ psql -d postgres
psql (8.2.15)
Type "help" for help.
postgres=# \c
You are now connected to database "postgres" as user "dgadmin".
postgres=# \l
                  List of databases
   Name    |  Owner  | Encoding |  Access privileges 
-----------+---------+----------+---------------------
 postgres  | dgadmin | UTF8     |
 template0 | dgadmin | UTF8     | =c/dgadmin         
                                : dgadmin=CTc/dgadmin
 template1 | dgadmin | UTF8     | =c/dgadmin         
                                : dgadmin=CTc/dgadmin
 tpch      | dgadmin | UTF8     |
 tpch_1g   | dgadmin | UTF8     |
(5 rows)
postgres=# \c tpch
You are now connected to database "tpch" as user "dgadmin".
tpch=# \dt
                     List of relations
 Schema |   Name   | Type  |  Owner  |       Storage       
--------+----------+-------+---------+----------------------
 public | customer | table | dgadmin | append only columnar
 public | lineitem | table | dgadmin | append only columnar
 public | nation   | table | dgadmin | append only columnar
 public | orders   | table | dgadmin | append only columnar
 public | part     | table | dgadmin | append only columnar
 public | partsupp | table | dgadmin | append only columnar
 public | region   | table | dgadmin | append only columnar
 public | supplier | table | dgadmin | append only columnar
(8 rows)
tpch=# select * from region;
 r_regionkey |          r_name           |                                                      r_comment                                                     
-------------+---------------------------+---------------------------------------------------------------------------------------------------------------------
           1 | AMERICA                   | hs use ironic, even requests. s
           3 | EUROPE                    | ly final courts cajole furiously final excuse
           0 | AFRICA                    | lar deposits. blithely final packages cajole. regular waters are final requests. regular accounts are according to
           2 | ASIA                      | ges. thinly even pinto beans ca
           4 | MIDDLE EAST               | uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl
(5 rows)
6. Simulate a failure of the Segment 1 instance
## Find the postmaster PID of dg1: 3898

dgadmin@flash:~$ ps -ef | grep postgres
postgres   1089      1  0 Jun27 ?        00:00:01 /usr/lib/postgresql/9.3/bin/postgres -D /var/lib/postgresql/9.3/main -c config_file=/etc/postgresql/9.3/main/postgresql.conf
postgres   1099   1089  0 Jun27 ?        00:00:00 postgres: checkpointer process                                                                                      
postgres   1100   1089  0 Jun27 ?        00:00:00 postgres: writer process                                                                                            
postgres   1101   1089  0 Jun27 ?        00:00:00 postgres: wal writer process                                                                                        
postgres   1102   1089  0 Jun27 ?        00:00:00 postgres: autovacuum launcher process                                                                               
postgres   1103   1089  0 Jun27 ?        00:00:02 postgres: stats collector process                                                                                   
dgadmin    3898      1  0 06:26 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/primary/dg1 -p 25433 -b 3 -z 2 --silent-mode=true -i -M mirrorless -C 1
dgadmin    3899      1  0 06:26 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/primary/dg0 -p 25432 -b 2 -z 2 --silent-mode=true -i -M mirrorless -C 0                                                                        
dgadmin    3915   3898  0 06:26 ?        00:00:00 postgres: port 25433, sweeper process                                                                               
dgadmin    3927      1  0 06:27 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/master/dg-1 -p 15432 -b 1 -z 2 --silent-mode=true -i -M master -C -1 -x 0 -E
dgadmin    3928   3927  0 06:27 ?        00:00:00 postgres: port 15432, master logger process                                                                         
dgadmin    3931   3927  0 06:27 ?        00:00:00 postgres: port 15432, stats collector process                                                                       
dgadmin    3932   3927  0 06:27 ?        00:00:00 postgres: port 15432, writer process                                                                                
dgadmin    3933   3927  0 06:27 ?        00:00:00 postgres: port 15432, checkpoint process                                                                            
dgadmin    3934   3927  0 06:27 ?        00:00:00 postgres: port 15432, seqserver process                                                                             
dgadmin    3935   3927  0 06:27 ?        00:00:00 postgres: port 15432, ftsprobe process                                                                              
dgadmin    3936   3927  0 06:27 ?        00:00:00 postgres: port 15432, sweeper process                                                                               
dgadmin    5012      1  0 06:42 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/mirror/dg1 -p 26433 -b 5 -z 2 --silent-mode=true -i -M quiescent -C 1
dgadmin    5013      1  0 06:42 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/mirror/dg0 -p 26432 -b 4 -z 2 --silent-mode=true -i -M quiescent -C 0
dgadmin    5014   5013  0 06:42 ?        00:00:00 postgres: port 26432, logger process                                                                                
dgadmin    5015   5012  0 06:42 ?        00:00:00 postgres: port 26433, logger process                                                                                
dgadmin    5022   5013  0 06:42 ?        00:00:00 postgres: port 26432, mirror process                                                                               
dgadmin    5023   5012  0 06:42 ?        00:00:00 postgres: port 26433, mirror process                                                                                
dgadmin    5106   5103  0 06:42 ?        00:00:00 postgres: port 25433, primary consumer ack process                                                                  
dgadmin    5107   5103  0 06:42 ?        00:00:01 postgres: port 25433, primary recovery process                                                                      
dgadmin    5108   5103  0 06:42 ?        00:00:00 postgres: port 25433, primary verification process                                                                 
dgadmin    5114   3899  0 06:42 ?        00:00:01 postgres: port 25432, primary process                                                                               
dgadmin    5115   5114  0 06:42 ?        00:00:00 postgres: port 25432, primary receiver ack process                                                                  
dgadmin    5116   5114  1 06:42 ?        00:00:20 postgres: port 25432, primary sender process                                                                        
dgadmin    5117   5114  0 06:42 ?        00:00:00 postgres: port 25432, primary consumer ack process                                                                  
dgadmin    5118   5114  0 06:42 ?        00:00:01 postgres: port 25432, primary recovery process                                                                      
dgadmin    5119   5114  0 06:42 ?        00:00:00 postgres: port 25432, primary verification process                                                                  
dgadmin    5919   3733  0 07:08 pts/0    00:00:00 grep --color=auto postgres


## Process list before the kill (abridged): PID 3898 is still present
dgadmin@flash:~$ ps -ef | grep postgres
postgres   1089      1  0 Jun27 ?        00:00:01 /usr/lib/postgresql/9.3/bin/postgres -D /var/lib/postgresql/9.3/main -c config_file=/etc/postgresql/9.3/main/postgresql.conf
postgres   1099   1089  0 Jun27 ?        00:00:00 postgres: checkpointer process                                                                                      
dgadmin    3898      1  0 06:26 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/primary/dg1 -p 25433 -b 3 -z 2 --silent-mode=true -i -M mirrorless -C 1
dgadmin    3899      1  0 06:26 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/primary/dg0 -p 25432 -b 2 -z 2 --silent-mode=true -i -M mirrorless -C 0                                                                                                     
dgadmin    3927      1  0 06:27 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/master/dg-1 -p 15432 -b 1 -z 2 --silent-mode=true -i -M master -C -1 -x 0 -E                                                                                                     
dgadmin    3936   3927  0 06:27 ?        00:00:00 postgres: port 15432, sweeper process                                                                               
dgadmin    5012      1  0 06:42 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/mirror/dg1 -p 26433 -b 5 -z 2 --silent-mode=true -i -M quiescent -C 1
dgadmin    5013      1  0 06:42 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/mirror/dg0 -p 26432 -b 4 -z 2 --silent-mode=true -i -M quiescent -C 0                                                                                   
dgadmin    5919   3733  0 07:08 pts/0    00:00:00 grep --color=auto postgres

## Kill the process to simulate a failure
dgadmin@flash:~$ kill -9 3898
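
## Sketch only, not part of the original session: the PID above was read off
## the ps listing by eye. The hypothetical helper segment_postmaster_pid does
## the same lookup mechanically by matching the "-D <datadir>" argument pair
## in ps -ef output.

```shell
#!/bin/sh
# Hypothetical helper: print the PID of the postmaster started with
# "-D <datadir>". Reads `ps -ef` output on stdin; $1 is the data directory.
segment_postmaster_pid() {
    awk -v dir="$1" '
        {
            for (i = 1; i < NF; i++)
                if ($i == "-D" && $(i + 1) == dir) { print $2; exit }
        }'
}

# Assumed usage:
#   ps -ef | segment_postmaster_pid /dgdata/primary/dg1
```

Matching on the exact `-D` argument avoids confusing the primary in /dgdata/primary/dg1 with its mirror in /dgdata/mirror/dg1. The first line of `postmaster.pid` in the data directory holds the same PID, as in stock PostgreSQL.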

## Check again: the dg1 process 3898 no longer exists
dgadmin@flash:~$ ps -ef | grep postgres
postgres   1089      1  0 Jun27 ?        00:00:01 /usr/lib/postgresql/9.3/bin/postgres -D /var/lib/postgresql/9.3/main -c config_file=/etc/postgresql/9.3/main/postgresql.conf
dgadmin    3899      1  0 06:26 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/primary/dg0 -p 25432 -b 2 -z 2 --silent-mode=true -i -M mirrorless -C 0                                                                                                     
dgadmin    3927      1  0 06:27 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/master/dg-1 -p 15432 -b 1 -z 2 --silent-mode=true -i -M master -C -1 -x 0 -E                                                                                                  
dgadmin    5012      1  0 06:42 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/mirror/dg1 -p 26433 -b 5 -z 2 --silent-mode=true -i -M quiescent -C 1
dgadmin    5013      1  0 06:42 ?        00:00:00 /home/dgadmin/vitesse.deepgreendb.16.17.170518/bin/postgres -D /dgdata/mirror/dg0 -p 26432 -b 4 -z 2 --silent-mode=true -i -M quiescent -C 0
dgadmin    5014   5013  0 06:42 ?        00:00:00 postgres: port 26432, logger process                                                                                
dgadmin    5119   5114  0 06:42 ?        00:00:00 postgres: port 25432, primary verification process                                                                  
dgadmin    5939   3733  0 07:09 pts/0    00:00:00 grep --color=auto postgres
7. Test connectivity to the database: queries still work normally
dgadmin@flash:~$ psql -d tpch
psql (8.2.15)
Type "help" for help.

tpch=# select * from region;
 r_regionkey |          r_name           |                                                      r_comment                                                     
-------------+---------------------------+---------------------------------------------------------------------------------------------------------------------
           1 | AMERICA                   | hs use ironic, even requests. s
           3 | EUROPE                    | ly final courts cajole furiously final excuse
           0 | AFRICA                    | lar deposits. blithely final packages cajole. regular waters are final requests. regular accounts are according to
           2 | ASIA                      | ges. thinly even pinto beans ca
           4 | MIDDLE EAST               | uickly special accounts cajole carefully blithely close requests. carefully final asymptotes haggle furiousl
(5 rows)
8. Check the cluster state: one primary segment is reported as failed
dgadmin@flash:~$ gpstate
20170628:07:15:28:006079 gpstate:flash:dgadmin-[INFO]:-Starting gpstate with args:
20170628:07:15:28:006079 gpstate:flash:dgadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.99.00 build Deepgreen DB'
20170628:07:15:28:006079 gpstate:flash:dgadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.99.00 build Deepgreen DB) on x86_64-unknown-linux-gnu, compiled by GCC gcc (Ubuntu 4.9.4-2ubuntu1~14.04.1) 4.9.4 compiled on May 18 2017 05:19:19'
20170628:07:15:28:006079 gpstate:flash:dgadmin-[INFO]:-Obtaining Segment details from master...
20170628:07:15:28:006079 gpstate:flash:dgadmin-[INFO]:-Gathering data from segments...
.
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-Greenplum instance status summary
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Master instance                                           = Active
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Master standby                                            = No master standby configured
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total segment instance count from metadata                = 4
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Primary Segment Status
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total primary segments                                    = 2
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total primary segment valid (at master)                   = 1
20170628:07:15:29:006079 gpstate:flash:dgadmin-[WARNING]:-Total primary segment failures (at master)                = 1                              <<<<<<<<
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20170628:07:15:29:006079 gpstate:flash:dgadmin-[WARNING]:-Total number postmaster processes missing                 = 1                              <<<<<<<<
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number postmaster processes found                   = 1
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Mirror Segment Status
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total mirror segments                                     = 2
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total mirror segment valid (at master)                    = 2
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total mirror segment failures (at master)                 = 0
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number postmaster processes found                   = 2
20170628:07:15:29:006079 gpstate:flash:dgadmin-[WARNING]:-Total number mirror segments acting as primary segments   = 1                              <<<<<<<<
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-   Total number mirror segments acting as mirror segments    = 1
20170628:07:15:29:006079 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
dgadmin@flash:~$
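The three WARNING lines are the ones that matter in this report. As a sketch for scripted checks, the hypothetical helper `gpstate_counter` below pulls the value of one labelled summary line out of `gpstate` output. It assumes the `label = value` layout shown above and ignores the trailing `<<<<<<<<` marker.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first gpstate summary line whose
# text contains the given label. Reads gpstate output on stdin; $1 is the label.
# Some labels appear in both the primary and the mirror section; in that case
# only the first occurrence is returned.
gpstate_counter() {
    awk -v label="$1" '
        index($0, label) {
            n = split($0, parts, "=")
            value = parts[n]
            sub(/^[ \t]+/, "", value)     # drop leading blanks
            sub(/[ \t<]+$/, "", value)    # drop trailing blanks and "<<<<" marker
            print value
            exit
        }'
}

# Assumed usage:
#   gpstate | gpstate_counter "mirror segments acting as primary segments"
```

A non-zero result for "mirror segments acting as primary segments" or for "Total primary segment failures" indicates that a failover has happened and recovery is pending.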
That completes the experiment. As shown, when a primary segment instance fails, its mirror takes over the service, and client applications see no downtime.

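The experiment is done, but the work is not: this was a failure, and a failure has to be repaired. What needs to be done for the cluster as a whole?
Point 1: Recover the failed segment promptly.
Point 2: A mirror can take over temporarily and keep the service running, but in production, primaries and their mirrors are spread across hosts, so running on mirrors for a long time leaves the workload unevenly distributed. Once the fault is repaired, switch back to the original layout promptly.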
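Whether the cluster is back in its original layout can be read from the `role` and `preferred_role` columns of `gp_segment_configuration`. The sketch below is an illustration under that assumption: `segments_out_of_role` is a hypothetical helper that lists every segment instance whose current role differs from its preferred role.

```shell
#!/bin/sh
# Hypothetical helper: list segment instances that are not in their preferred
# role. Input on stdin: one "dbid|content|role|preferred_role" row per line.
segments_out_of_role() {
    awk -F'|' '$3 != $4 {
        print "content " $2 " dbid " $1 ": role=" $3 " preferred=" $4
    }'
}

# Assumed usage against a running cluster:
#   psql -d postgres -A -t -F'|' \
#        -c "SELECT dbid, content, role, preferred_role
#            FROM gp_segment_configuration WHERE content >= 0
#            ORDER BY content, dbid" \
#     | segments_out_of_role
```

Empty output means every instance is in its preferred role. If instances are listed after the failed segment has been recovered and resynchronized, `gprecoverseg -r` is the utility option that rebalances the roles.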

II. Recovering the failed segment
After a segment has failed, it can be recovered with the gprecoverseg command, as follows:
dgadmin@flash:~$ gprecoverseg
20170628:22:01:34:001650 gprecoverseg:flash:dgadmin-[INFO]:-Starting gprecoverseg with args:
20170628:22:01:34:001650 gprecoverseg:flash:dgadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.99.00 build Deepgreen DB'
20170628:22:01:34:001650 gprecoverseg:flash:dgadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.99.00 build Deepgreen DB) on x86_64-unknown-linux-gnu, compiled by GCC gcc (Ubuntu 4.9.4-2ubuntu1~14.04.1) 4.9.4 compiled on May 18 2017 05:19:19'
20170628:22:01:34:001650 gprecoverseg:flash:dgadmin-[INFO]:-Checking if segments are ready
20170628:22:01:34:001650 gprecoverseg:flash:dgadmin-[INFO]:-Obtaining Segment details from master...
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-Obtaining Segment details from master...
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-Greenplum instance recovery parameters
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:----------------------------------------------------------
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-Recovery type              = Standard
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:----------------------------------------------------------
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-Recovery 1 of 1
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:----------------------------------------------------------
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Synchronization mode                        = Incremental
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Failed instance host                        = flash
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Failed instance address                     = flash
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Failed instance directory                   = /dgdata/primary/dg1
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Failed instance port                        = 25433
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Failed instance replication port            = 28433
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Recovery Source instance host               = flash
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Recovery Source instance address            = flash
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Recovery Source instance directory          = /dgdata/mirror/dg1
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Recovery Source instance port               = 26433
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Recovery Source instance replication port   = 27433
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:-   Recovery Target                             = in-place
20170628:22:01:35:001650 gprecoverseg:flash:dgadmin-[INFO]:----------------------------------------------------------

Continue with segment recovery procedure Yy|Nn (default=N):
> y
20170628:22:01:38:001650 gprecoverseg:flash:dgadmin-[INFO]:-1 segment(s) to recover
20170628:22:01:38:001650 gprecoverseg:flash:dgadmin-[INFO]:-Ensuring 1 failed segment(s) are stopped

20170628:22:01:39:001650 gprecoverseg:flash:dgadmin-[INFO]:-Ensuring that shared memory is cleaned up for stopped segments
20170628:22:01:45:001650 gprecoverseg:flash:dgadmin-[INFO]:-Updating configuration with new mirrors
20170628:22:01:45:001650 gprecoverseg:flash:dgadmin-[INFO]:-Updating mirrors
......
20170628:22:01:51:001650 gprecoverseg:flash:dgadmin-[INFO]:-Starting mirrors
20170628:22:01:51:001650 gprecoverseg:flash:dgadmin-[INFO]:-Commencing parallel primary and mirror segment instance startup, please wait...
.......
20170628:22:01:58:001650 gprecoverseg:flash:dgadmin-[INFO]:-Process results...
20170628:22:01:58:001650 gprecoverseg:flash:dgadmin-[INFO]:-Updating configuration to mark mirrors up
20170628:22:01:58:001650 gprecoverseg:flash:dgadmin-[INFO]:-Updating primaries
20170628:22:01:58:001650 gprecoverseg:flash:dgadmin-[INFO]:-Commencing parallel primary conversion of 1 segments, please wait...
.......
20170628:22:02:05:001650 gprecoverseg:flash:dgadmin-[INFO]:-Process results...
20170628:22:02:05:001650 gprecoverseg:flash:dgadmin-[INFO]:-Done updating primaries
20170628:22:02:06:001650 gprecoverseg:flash:dgadmin-[INFO]:-******************************************************************
20170628:22:02:06:001650 gprecoverseg:flash:dgadmin-[INFO]:-Updating segments for resynchronization is completed.
20170628:22:02:06:001650 gprecoverseg:flash:dgadmin-[INFO]:-For segments updated successfully, resynchronization will continue in the background.
20170628:22:02:06:001650 gprecoverseg:flash:dgadmin-[INFO]:-
20170628:22:02:06:001650 gprecoverseg:flash:dgadmin-[INFO]:-Use  gpstate -s  to check the resynchronization progress.
20170628:22:02:06:001650 gprecoverseg:flash:dgadmin-[INFO]:-******************************************************************
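As the last lines of the log say, resynchronization continues in the background after gprecoverseg returns. Below is a minimal sketch of waiting for it to finish by polling the catalog rather than reading gpstate -s by hand. It assumes a Greenplum 4.3-style catalog in which gp_segment_configuration.mode is 's' (synchronized), 'r' (resynchronizing) or 'c' (change tracking). The function names unsynced_count and wait_until_zero are hypothetical helpers introduced for illustration, and the database name and port are taken from this article's test environment.

```shell
# Sketch only: wait for background resynchronization to finish.
# Assumes a Greenplum 4.3-style catalog, where gp_segment_configuration.mode
# is 's' (synchronized), 'r' (resynchronizing) or 'c' (change tracking).

# Hypothetical helper: prints how many segments are not yet synchronized.
# Database "postgres" and master port 15432 match this article's test
# environment; change them for your cluster.
unsynced_count() {
  psql -d postgres -p 15432 -At \
    -c "select count(*) from gp_segment_configuration where content >= 0 and mode <> 's';"
}

# Hypothetical helper: polls a counter command until it prints 0.
# $1 = command that prints a count, $2 = seconds between polls, $3 = max polls.
# Returns 0 once the count reaches 0, 1 if it never does.
wait_until_zero() {
  wz_cmd=$1; wz_interval=$2; wz_tries=$3
  while [ "$wz_tries" -gt 0 ]; do
    if [ "$($wz_cmd)" = "0" ]; then
      return 0
    fi
    sleep "$wz_interval"
    wz_tries=$((wz_tries - 1))
  done
  return 1
}
```

A possible invocation is `wait_until_zero unsynced_count 10 60 && echo "all segments synchronized"`, which polls every 10 seconds for up to 10 minutes. The limits are arbitrary and should be sized to the amount of data each segment has to resynchronize.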
Check the cluster status:
dgadmin@flash:~$ gpstate
20170628:22:19:45:003527 gpstate:flash:dgadmin-[INFO]:-Starting gpstate with args:
20170628:22:19:45:003527 gpstate:flash:dgadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.99.00 build Deepgreen DB'
20170628:22:19:45:003527 gpstate:flash:dgadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.3.99.00 build Deepgreen DB) on x86_64-unknown-linux-gnu, compiled by GCC gcc (Ubuntu 4.9.4-2ubuntu1~14.04.1) 4.9.4 compiled on May 18 2017 05:19:19'
20170628:22:19:45:003527 gpstate:flash:dgadmin-[INFO]:-Obtaining Segment details from master...
20170628:22:19:45:003527 gpstate:flash:dgadmin-[INFO]:-Gathering data from segments...
.
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-Greenplum instance status summary
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Master instance                                           = Active
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Master standby                                            = No master standby configured
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total segment instance count from metadata                = 4
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Primary Segment Status
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total primary segments                                    = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total primary segment valid (at master)                   = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total primary segment failures (at master)                = 0
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number postmaster processes found                   = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Mirror Segment Status
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total mirror segments                                     = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total mirror segment valid (at master)                    = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total mirror segment failures (at master)                 = 0
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files missing              = 0
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid files found                = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs missing               = 0
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of postmaster.pid PIDs found                 = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files missing                   = 0
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number of /tmp lock files found                     = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number postmaster processes missing                 = 0
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number postmaster processes found                   = 2
20170628:22:19:46:003527 gpstate:flash:dgadmin-[WARNING]:-Total number mirror segments acting as primary segments   = 1                              <<<<<<<<
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-   Total number mirror segments acting as mirror segments    = 1
20170628:22:19:46:003527 gpstate:flash:dgadmin-[INFO]:-----------------------------------------------------
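III. Switching back to the original cluster state
The final status output in Part II shows that a Mirror is still serving in place of its Primary. In this part we hand the service back from the Mirror to the Primary.
Run gprecoverseg -r to switch the roles back, then run gpstate to check the status once it completes (output omitted).
We can also review the switchover history from inside the database through the catalog table gp_configuration_history: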
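Before running gprecoverseg -r it helps to confirm which segments are actually running in a swapped role. The sketch below compares role with preferred_role in gp_segment_configuration. It assumes the 4.3-style catalog columns (dbid, content, role, preferred_role, mode, status); segments_to_rebalance and list_swapped_roles are hypothetical helper names, and the connection settings follow this article's test environment.

```shell
# Sketch only: list segments whose current role differs from their preferred role.

# Reads rows of "dbid|content|role|preferred_role|mode|status" on stdin
# (the unaligned, tuples-only format produced by psql -At) and prints one
# line for every segment whose role and preferred_role differ.
segments_to_rebalance() {
  awk -F'|' 'NF >= 4 && $3 != $4 {
    printf "dbid=%s content=%s role=%s preferred_role=%s\n", $1, $2, $3, $4
  }'
}

# Hypothetical wrapper: queries the master and filters the result.
# Database "postgres" and port 15432 are this article's test settings.
list_swapped_roles() {
  psql -d postgres -p 15432 -At \
    -c "select dbid, content, role, preferred_role, mode, status from gp_segment_configuration where content >= 0 order by dbid;" \
  | segments_to_rebalance
}
```

If list_swapped_roles prints any rows and every segment is back in synchronized mode, gprecoverseg -r can return the roles to their preferred assignment. Running the function again afterwards should print nothing.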
postgres=# select * from gp_configuration_history;
             time              | dbid |                                             desc                                             
-------------------------------+------+-----------------------------------------------------------------------------------------------
 2017-06-28 06:42:21.789641+08 |    4 | gpaddmirrors: segment config for resync: inserted mirror segment configuration
 2017-06-28 06:42:21.789641+08 |    5 | gpaddmirrors: segment config for resync: inserted mirror segment configuration
 2017-06-28 06:42:21.789641+08 |    2 | gpaddmirrors: segment config for resync: segment mode, status, and replication port
 2017-06-28 06:42:21.789641+08 |    3 | gpaddmirrors: segment config for resync: segment mode, status, and replication port
 2017-06-28 06:42:35.327377+08 |    2 | gpaddmirrors: segment resync marking mirrors up and primaries resync: segment mode and status
 2017-06-28 06:42:35.327377+08 |    3 | gpaddmirrors: segment resync marking mirrors up and primaries resync: segment mode and status
 2017-06-28 06:42:35.327377+08 |    4 | gpaddmirrors: segment resync marking mirrors up and primaries resync: segment mode and status
 2017-06-28 06:42:35.327377+08 |    5 | gpaddmirrors: segment resync marking mirrors up and primaries resync: segment mode and status
 2017-06-28 06:56:03.98806+08  |    2 | FTS: changed segment to insync from resync.
 2017-06-28 06:56:03.98806+08  |    4 | FTS: changed segment to insync from resync.
 2017-06-28 06:56:04.045572+08 |    3 | FTS: changed segment to insync from resync.
 2017-06-28 06:56:04.045572+08 |    5 | FTS: changed segment to insync from resync.
 2017-06-28 07:10:07.745568+08 |    3 | FTS: content 1 fault marking status DOWN role m
 2017-06-28 07:10:07.745849+08 |    5 | FTS: content 1 fault marking status UP mode: change-tracking role p
 2017-06-28 22:01:45.341958+08 |    3 | gprecoverseg: segment config for resync: segment mode and status
 2017-06-28 22:01:58.732622+08 |    3 | gprecoverseg: segment resync marking mirrors up and primaries resync: segment mode and status
 2017-06-28 22:01:58.732622+08 |    5 | gprecoverseg: segment resync marking mirrors up and primaries resync: segment mode and status
 2017-06-28 22:02:57.53962+08  |    5 | FTS: changed segment to insync from resync.
 2017-06-28 22:02:57.53962+08  |    3 | FTS: changed segment to insync from resync.
 2017-06-28 22:31:02.680013+08 |    5 | FTS: content 1 fault marking status DOWN role m
 2017-06-28 22:31:02.680309+08 |    3 | FTS: content 1 fault marking status UP mode: change-tracking role p
 2017-06-28 22:31:12.655695+08 |    5 | gprecoverseg: segment config for resync: segment mode and status
 2017-06-28 22:31:25.848857+08 |    3 | gprecoverseg: segment resync marking mirrors up and primaries resync: segment mode and status
 2017-06-28 22:31:25.848857+08 |    5 | gprecoverseg: segment resync marking mirrors up and primaries resync: segment mode and status
 2017-06-28 22:31:58.752335+08 |    3 | FTS: changed segment to insync from resync.
 2017-06-28 22:31:58.752335+08 |    5 | FTS: changed segment to insync from resync.
(26 rows)
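That wraps up today's post, with one last remark. As the example shows, primary/mirror switching is quite simple and gprecoverseg is quite smart: when the host of a Primary fails, the Mirror automatically becomes the Primary and the database keeps working normally. For systems without adequate monitoring, however, I do not recommend relying on this feature. First, the feature has some bugs. Second, without adequate monitoring a switchover may go unnoticed, and if another node then fails, data recovery may be affected. In addition, when a single node holds a very large amount of data, the gprecoverseg synchronization can take a very long time.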
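The monitoring gap described above can be narrowed with a small periodic check. The following is a minimal sketch that reads the gpstate summary and reports when any mirror is acting as primary, using the summary line shown in the gpstate output earlier in this article. The function names are hypothetical, and the script assumes gpstate is on the PATH of the user running it.

```shell
# Sketch only: detect a failover from the gpstate summary.

# Reads gpstate summary output on stdin and prints the value of the line
# "Total number mirror segments acting as primary segments".
# Prints 0 if that line is not present in the input.
mirrors_acting_as_primary() {
  awk -F'=' '
    /mirror segments acting as primary segments/ {
      v = $2
      gsub(/[^0-9]/, "", v)   # drop padding and the "<<<<<<<<" marker
      n = v
    }
    END { print n + 0 }
  '
}

# Hypothetical check for cron or a monitoring agent:
# returns 1 and writes a warning to stderr when a failover is in effect.
check_failover() {
  cf_count=$(gpstate 2>&1 | mirrors_acting_as_primary)
  if [ "$cf_count" -gt 0 ]; then
    echo "WARNING: $cf_count mirror segment(s) acting as primary" >&2
    return 1
  fi
  return 0
}
```

Calling check_failover from a scheduled job and alerting on a non-zero exit status would surface a switchover soon after it happens, so that gprecoverseg can be run before a second failure occurs.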

Finally, may you set out on the Deepgreen & Greenplum road and never look back ^_^
