Introduction to Greenplum full backup: gpcrondump

This section introduces full backups in Greenplum.
Full backup script:

#!/bin/bash
GPHOME=/home/digoal/gphome

# Replace with symlink path if it is present and correct
if [ -h ${GPHOME}/../greenplum-db ]; then
    GPHOME_BY_SYMLINK=`(cd ${GPHOME}/../greenplum-db/ && pwd -P)`
    if [ x"${GPHOME_BY_SYMLINK}" = x"${GPHOME}" ]; then
        GPHOME=`(cd ${GPHOME}/../greenplum-db/ && pwd -L)`/.
    fi
    unset GPHOME_BY_SYMLINK
fi
#setup PYTHONHOME
if [ -x $GPHOME/ext/python/bin/python ]; then
    PYTHONHOME="$GPHOME/ext/python"
fi
PYTHONPATH=$GPHOME/lib/python
PATH=$GPHOME/bin:$PYTHONHOME/bin:$PATH
LD_LIBRARY_PATH=$GPHOME/lib:$PYTHONHOME/lib:$LD_LIBRARY_PATH
OPENSSL_CONF=$GPHOME/etc/openssl.cnf

export GPHOME
export PATH
export LD_LIBRARY_PATH
export PYTHONPATH
export PYTHONHOME
export OPENSSL_CONF

export MASTER_DATA_DIRECTORY=/data01/digoal/gpdatalocal/gpseg-1
export PGHOST=127.0.0.1
export PGPORT=1922
export PGUSER=digoal
export PGDATABASE=postgres
export PGPASSWORD=digoal


backupdir="/data01/digoal/gpbackup"
logdir=$backupdir
masterdir="/data01/digoal/gpdatalocal/gpseg-1"
dbid="digoal"

dat=`psql -A -q -t -h $PGHOST -p $PGPORT -U $PGUSER -c "select ' -x '||string_agg(datname, ' -x ') from pg_database where datname <>'template0'"`
gpcrondump -a -C --dump-stats -g -G -h -r --use-set-session-authorization $dat -u $backupdir --prefix $dbid -l $logdir -d $masterdir
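
For reference, the psql query above builds a single string of -x flags covering every database except template0. As a rough illustration (not from the original article; the database names are assumed from the logs shown further down), $dat would expand to something like:

$ echo $dat
 -x digoal -x template1 -x postgres -x db2 -x db3 -x db1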

Alternatively:

backupdir="/data01/digoal/gpbackup"
logdir=$backupdir
masterdir="/data01/digoal/gpdatalocal/gpseg-1"
dbid="digoal"

for dbname in `psql -A -q -t -h $PGHOST -p $PGPORT -U $PGUSER -c "select datname from pg_database where datname <>'template0'"`
do
now=`date +%Y%m%d%H%M%S`
gpcrondump -a -C --dump-stats -g -G -h -r --use-set-session-authorization -x $dbname -u $backupdir --prefix $dbid -l $logdir -d $masterdir -K $now
done

gpcrondump checks the timestamp supplied with -K: if the YYYYMMDD dump directory for that date already contains a backup dated later than this timestamp, it reports an error. Different databases therefore cannot be backed up with one and the same timestamp. (In the failed run below, the loop reuses a $now value generated before earlier backups, so every invocation is rejected.)

$ for dbname in `psql -A -q -t -h $PGHOST -p $PGPORT -U $PGUSER -c "select datname from pg_database where datname <>'template0'"`
> do
> gpcrondump -a -C --dump-stats -g -G -h -r --use-set-session-authorization -x $dbname -u $backupdir --prefix $dbid -l $logdir -d $masterdir -K $now
> done
20160416:17:25:55:016061 gpcrondump:db153175032:digoal-[INFO]:-Starting gpcrondump with args: -a -C --dump-stats -g -G -h -r --use-set-session-authorization -x digoal -u /data01/digoal/gpbackup --prefix digoal -l /data01/digoal/gpbackup -d /data01/digoal/gpdatalocal/gpseg-1 -K 20160416171907
20160416:17:25:55:016061 gpcrondump:db153175032:digoal-[CRITICAL]:-gpcrondump failed. (Reason='There is a future dated backup on the system preventing new backups') exiting...
20160416:17:25:55:016151 gpcrondump:db153175032:digoal-[INFO]:-Starting gpcrondump with args: -a -C --dump-stats -g -G -h -r --use-set-session-authorization -x template1 -u /data01/digoal/gpbackup --prefix digoal -l /data01/digoal/gpbackup -d /data01/digoal/gpdatalocal/gpseg-1 -K 20160416171907
20160416:17:25:55:016151 gpcrondump:db153175032:digoal-[CRITICAL]:-gpcrondump failed. (Reason='There is a future dated backup on the system preventing new backups') exiting...
20160416:17:25:55:016241 gpcrondump:db153175032:digoal-[INFO]:-Starting gpcrondump with args: -a -C --dump-stats -g -G -h -r --use-set-session-authorization -x postgres -u /data01/digoal/gpbackup --prefix digoal -l /data01/digoal/gpbackup -d /data01/digoal/gpdatalocal/gpseg-1 -K 20160416171907
20160416:17:25:55:016241 gpcrondump:db153175032:digoal-[CRITICAL]:-gpcrondump failed. (Reason='There is a future dated backup on the system preventing new backups') exiting...
20160416:17:25:55:016331 gpcrondump:db153175032:digoal-[INFO]:-Starting gpcrondump with args: -a -C --dump-stats -g -G -h -r --use-set-session-authorization -x db2 -u /data01/digoal/gpbackup --prefix digoal -l /data01/digoal/gpbackup -d /data01/digoal/gpdatalocal/gpseg-1 -K 20160416171907
20160416:17:25:55:016331 gpcrondump:db153175032:digoal-[CRITICAL]:-gpcrondump failed. (Reason='There is a future dated backup on the system preventing new backups') exiting...
20160416:17:25:55:016421 gpcrondump:db153175032:digoal-[INFO]:-Starting gpcrondump with args: -a -C --dump-stats -g -G -h -r --use-set-session-authorization -x db3 -u /data01/digoal/gpbackup --prefix digoal -l /data01/digoal/gpbackup -d /data01/digoal/gpdatalocal/gpseg-1 -K 20160416171907
20160416:17:25:55:016421 gpcrondump:db153175032:digoal-[CRITICAL]:-gpcrondump failed. (Reason='There is a future dated backup on the system preventing new backups') exiting...
20160416:17:25:56:016511 gpcrondump:db153175032:digoal-[INFO]:-Starting gpcrondump with args: -a -C --dump-stats -g -G -h -r --use-set-session-authorization -x db1 -u /data01/digoal/gpbackup --prefix digoal -l /data01/digoal/gpbackup -d /data01/digoal/gpdatalocal/gpseg-1 -K 20160416171907
20160416:17:25:56:016511 gpcrondump:db153175032:digoal-[CRITICAL]:-gpcrondump failed. (Reason='There is a future dated backup on the system preventing new backups') exiting...

Backup logs are written to:

/data01/digoal/gpbackup

Backup data goes to an automatically created, date-named subdirectory:

/data01/digoal/gpbackup/db_dumps/$YYYYMMDD

Each dumped database also records its own backup history (in the table public.gpcrondump_history, created by the -h option):

postgres=# select * from gpcrondump_history ;
-[ RECORD 14 ]-----+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
rec_date           | 2016-04-16 15:37:02.364166
start_time         | 15:33:19
end_time           | 15:37:00
options            | -a --dump-stats -g -G -h -r -x digoal -x template1 -x postgres -x db2 -x db3 -x db1 -u /data01/digoal/gpbackup --prefix digoal -l /data01/digoal/gpbackup -d /data01/digoal/gpdatalocal/gpseg-1
dump_key           | 20160416153319    -- timestamp at which the dump started; this key is needed when restoring with gpdbrestore
dump_exit_status   | 0
script_exit_status | 0
exit_text          | COMPLETED
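
As a small sketch (not part of the original article), the latest dump key can be read back from this history table for use in restore scripts, for example:

psql -A -q -t -h $PGHOST -p $PGPORT -U $PGUSER -d digoal -c "select dump_key from public.gpcrondump_history order by rec_date desc limit 1"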

The dump key is also included in gpcrondump's standard output:

20160416:16:37:00:020201 gpcrondump:db153175032:digoal-[INFO]:-Target database                          = digoal
20160416:16:37:00:020201 gpcrondump:db153175032:digoal-[INFO]:-Dump subdirectory                        = 20160416
20160416:16:37:00:020201 gpcrondump:db153175032:digoal-[INFO]:-Dump type                                = Full database
20160416:16:37:00:020201 gpcrondump:db153175032:digoal-[INFO]:-Clear old dump directories               = Off
20160416:16:37:00:020201 gpcrondump:db153175032:digoal-[INFO]:-Dump start time                          = 16:36:55
20160416:16:37:00:020201 gpcrondump:db153175032:digoal-[INFO]:-Dump end time                            = 16:36:59
20160416:16:37:00:020201 gpcrondump:db153175032:digoal-[INFO]:-Status                                   = COMPLETED
20160416:16:37:00:020201 gpcrondump:db153175032:digoal-[INFO]:-Dump key                                 = 20160416163655
20160416:16:37:00:020201 gpcrondump:db153175032:digoal-[INFO]:-Dump file compression                    = On
20160416:16:37:00:020201 gpcrondump:db153175032:digoal-[INFO]:-Vacuum mode type                         = Off
20160416:16:37:00:020201 gpcrondump:db153175032:digoal-[INFO]:-Exit code zero, no warnings generated

The backup files for the digoal database above:

$cd /data01/digoal/gpbackup/db_dumps/20160416

$ll *20160416163655*
-rw------- 1 digoal users  113 Apr 16 16:36 digoal_gp_cdatabase_1_1_20160416163655
-rw------- 1 digoal users 3.2K Apr 16 16:36 digoal_gp_dump_0_2_20160416163655.gz
-rw------- 1 digoal users 3.3K Apr 16 16:36 digoal_gp_dump_0_3_20160416163655.gz
-rw------- 1 digoal users 3.4K Apr 16 16:36 digoal_gp_dump_0_4_20160416163655.gz
-rw------- 1 digoal users 3.3K Apr 16 16:36 digoal_gp_dump_0_5_20160416163655.gz
-rw------- 1 digoal users 3.4K Apr 16 16:36 digoal_gp_dump_0_6_20160416163655.gz
-rw------- 1 digoal users 3.3K Apr 16 16:36 digoal_gp_dump_0_7_20160416163655.gz
-rw------- 1 digoal users 3.4K Apr 16 16:36 digoal_gp_dump_0_8_20160416163655.gz
-rw------- 1 digoal users 3.3K Apr 16 16:36 digoal_gp_dump_0_9_20160416163655.gz
-rw------- 1 digoal users  889 Apr 16 16:36 digoal_gp_dump_1_1_20160416163655.gz
-rw------- 1 digoal users  196 Apr 16 16:36 digoal_gp_dump_1_1_20160416163655_post_data.gz
-rw-r--r-- 1 digoal users    0 Apr 16 16:36 digoal_gp_dump_20160416163655_ao_state_file
-rw-r--r-- 1 digoal users    0 Apr 16 16:36 digoal_gp_dump_20160416163655_co_state_file
-rw-r--r-- 1 digoal users    0 Apr 16 16:36 digoal_gp_dump_20160416163655_last_operation
-rw-r--r-- 1 digoal users 2.3K Apr 16 16:36 digoal_gp_dump_20160416163655.rpt
-rw------- 1 digoal users 1.3K Apr 16 16:36 digoal_gp_dump_status_0_2_20160416163655
-rw------- 1 digoal users 1.3K Apr 16 16:36 digoal_gp_dump_status_0_3_20160416163655
-rw------- 1 digoal users 1.3K Apr 16 16:36 digoal_gp_dump_status_0_4_20160416163655
-rw------- 1 digoal users 1.3K Apr 16 16:36 digoal_gp_dump_status_0_5_20160416163655
-rw------- 1 digoal users 1.3K Apr 16 16:36 digoal_gp_dump_status_0_6_20160416163655
-rw------- 1 digoal users 1.3K Apr 16 16:36 digoal_gp_dump_status_0_7_20160416163655
-rw------- 1 digoal users 1.3K Apr 16 16:36 digoal_gp_dump_status_0_8_20160416163655
-rw------- 1 digoal users 1.3K Apr 16 16:36 digoal_gp_dump_status_0_9_20160416163655
-rw------- 1 digoal users 2.3K Apr 16 16:36 digoal_gp_dump_status_1_1_20160416163655
-rw-r--r-- 1 digoal users 1.0K Apr 16 16:37 digoal_gp_global_1_1_20160416163655
-rw-r--r-- 1 digoal users 4.8K Apr 16 16:36 digoal_gp_statistics_1_1_20160416163655
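
To restore from this backup, the dump key is passed to gpdbrestore. A minimal sketch (not part of the original run; verify the options against your gpdbrestore version):

gpdbrestore -t 20160416163655 -u /data01/digoal/gpbackup --prefix digoal -a

Here -t takes the dump key, -u and --prefix must match the values given to gpcrondump, and -a suppresses confirmation prompts.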

The next section will cover incremental backups.

Known bugs in Greenplum gpcrondump:
.1. It does not support specifying the superuser name to use for the backup; by default gpcrondump makes pg_dump connect as the current OS user.
.2. When dumping the DDL for language handlers and CREATE DATABASE, identifiers are not double-quoted. If a role name contains characters other than lowercase letters and underscores, the restore will fail.

There may be similar bugs elsewhere.

.3. Template databases cannot be dropped, so restoring with gpdbrestore -e (which drops and re-creates the database) fails for them.
.4. COPY uses a large amount of memory and may trigger the OOM killer.
.5. gpcrondump calls getcwd to obtain the current working directory, so it cannot be run from a directory that no longer exists.

Explanation of the gpcrondump parameters used above:

**********************
Return Codes
**********************

The following is a list of the codes that gpcrondump returns.
   0 - Dump completed with no problems
   1 - Dump completed, but one or more warnings were generated
   2 - Dump failed with a fatal error

-a (do not prompt) 

 Do not prompt the user for confirmation. 

-d <master_data_directory> 

 The master host data directory. If not specified, the value set for 
 $MASTER_DATA_DIRECTORY will be used. 

--dump-stats

 Dump optimizer statistics from pg_statistic. Statistics are dumped in the
 master data directory to db_dumps/YYYYMMDD/gp_statistics_1_1_<timestamp>.

-g (copy config files) 

 Secure a copy of the master and segment configuration files 
 postgresql.conf, pg_ident.conf, and pg_hba.conf. These configuration 
 files are dumped in the master or segment data directory to 
 db_dumps/YYYYMMDD/config_files_<timestamp>.tar. 

 If --ddboost is specified, the backup is located on the default storage 
 unit in the directory specified by --ddboost-backupdir when the Data 
 Domain Boost credentials were set.

-G (dump global objects) 

 Use pg_dumpall to dump global objects such as roles and tablespaces. 
 Global objects are dumped in the master data directory to 
 db_dumps/YYYYMMDD/gp_global_1_1_<timestamp>. 

-h (record dump details) 

 Record details of database dump in database table 
 public.gpcrondump_history in database supplied via -x option. Utility 
 will create table if it does not currently exist. 

--incremental (backup changes to append-optimized tables)

 Adds an incremental backup to a backup set. When performing an 
 incremental backup, the complete backup set created prior to the 
 incremental backup must be available. The complete backup set includes 
 the following backup files: 

 * The last full backup before the current incremental backup 

 * All incremental backups created between the time of the full backup 
   and the current incremental backup 

 An incremental backup is similar to a full backup except for 
 append-optimized tables, including column-oriented tables. An 
 append-optimized table is backed up only if at least one of the 
 following operations was performed on the table after the last backup. 
   ALTER TABLE 
   INSERT 
   UPDATE
   DELETE
   TRUNCATE 
   DROP and then re-create the table

 For partitioned append-optimized tables, only the changed table 
 partitions are backed up. 

 The -u option must be used consistently within a backup set that 
 includes full and incremental backups. If you use the -u option with a 
 full backup, you must use the -u option when you create incremental 
 backups that are part of the backup set that includes the full backup. 

 You can create an incremental backup for a full backup of a set of 
 database tables. When you create the full backup, specify the --prefix 
 option to identify the backup. To include a set of tables in the full 
 backup, use either the -t option or --table-file option. To exclude a 
 set of tables, use either the -T option or the --exclude-table-file 
 option. See the description of the option for more information on its 
 use. 

 To create an incremental backup based on the full backup of the set of 
 tables, specify the option --incremental and the --prefix option with 
 the string specified when creating the full backup. The incremental 
 backup is limited to only the tables in the full backup. 

 WARNING: gpcrondump does not check for available disk space prior to 
 performing an incremental backup.

 IMPORTANT: An incremental backup set (a full backup and its associated 
 incremental backups) must be on a single device. For example, the 
 backups in a backup set must all be on a file system or must all be on a 
 Data Domain system. 

--prefix <prefix_string> [--list-filter-tables ]

 Prepends <prefix_string> followed by an underscore character (_) to the 
 names of all the backup files created during a backup. 

-r (rollback on failure) 

 Rollback the dump files (delete a partial dump) if a failure is 
 detected. The default is to not rollback. 

-u <backup_directory> 

 Specifies the absolute path where the backup files will be placed on 
 each host. If the path does not exist, it will be created, if possible. 
 If not specified, defaults to the data directory of each instance to be 
 backed up. Using this option may be desirable if each segment host has 
 multiple segment instances as it will create the dump files in a 
 centralized location rather than the segment data directories. 

 Note: This option is not supported if --ddboost is specified. 

--use-set-session-authorization 

 Use SET SESSION AUTHORIZATION commands instead of ALTER OWNER commands 
 to set object ownership. 

-x <database_name> 

 Required. The name of the Greenplum database to dump. Specify multiple times for 
 multiple databases. 
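
Since gpcrondump is designed to run unattended (hence the -a option above), the full backup script can be driven by cron. A minimal sketch, assuming the script at the top of this article were saved as /home/digoal/gp_full_backup.sh (a hypothetical path):

# crontab entry for the backup OS user: full backup every day at 01:00
0 1 * * * /home/digoal/gp_full_backup.sh >> /data01/digoal/gpbackup/cron_full_backup.log 2>&1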