Greenplum的全量恢复介绍, gpdbrestore-阿里云开发者社区

开发者社区> 阿里云数据库> 正文
登录阅读全文

Greenplum的全量恢复介绍, gpdbrestore

简介: 本节介绍一下Greenplum的全量恢复恢复时需要指定dump key ( 即gpcrondump时,每个数据库备份都带的时间戳)全量恢复需要考虑几个因素, DROP DATABASE, TRUNCATE TABLE, DROP TABLE.这些和gpcrondump或者gpdbrestore的参.

本节介绍一下Greenplum的全量恢复
恢复时需要指定dump key ( 即gpcrondump时,每个数据库备份都带的时间戳)
全量恢复需要考虑几个因素, DROP DATABASE, TRUNCATE TABLE, DROP TABLE.
这些和gpcrondump或者gpdbrestore的参数有关。
同时也关系到数据是否需要先被清除掉,然后从备份恢复。

#!/bin/bash
GPHOME=/home/digoal/gphome

# Replace with symlink path if it is present and correct
if [ -h ${GPHOME}/../greenplum-db ]; then
    GPHOME_BY_SYMLINK=`(cd ${GPHOME}/../greenplum-db/ && pwd -P)`
    if [ x"${GPHOME_BY_SYMLINK}" = x"${GPHOME}" ]; then
        GPHOME=`(cd ${GPHOME}/../greenplum-db/ && pwd -L)`/.
    fi
    unset GPHOME_BY_SYMLINK
fi
#setup PYTHONHOME
if [ -x $GPHOME/ext/python/bin/python ]; then
    PYTHONHOME="$GPHOME/ext/python"
fi
PYTHONPATH=$GPHOME/lib/python
PATH=$GPHOME/bin:$PYTHONHOME/bin:$PATH
LD_LIBRARY_PATH=$GPHOME/lib:$PYTHONHOME/lib:$LD_LIBRARY_PATH
OPENSSL_CONF=$GPHOME/etc/openssl.cnf

export GPHOME
export PATH
export LD_LIBRARY_PATH
export PYTHONPATH
export PYTHONHOME
export OPENSSL_CONF

export MASTER_DATA_DIRECTORY=/data01/digoal/gpdatalocal/gpseg-1
export PGHOST=127.0.0.1
export PGPORT=1922
export PGUSER=digoal
export PGDATABASE=postgres
export PGPASSWORD=digoal

backupdir="/data01/digoal/gpbackup"
logdir=$backupdir
masterdir="/data01/digoal/gpdatalocal/gpseg-1"
dat=`psql -A -q -t -h $PGHOST -p $PGPORT -U $PGUSER -c "select ' -x '||string_agg(datname, ' -x ') from pg_database where datname <>'template0'"`
dbid="digoal"

for dumpid in 20160416172728 20160416172733 20160416172738 20160416172743 20160416172748 20160416172753
do
  gpdbrestore -a -e -d $masterdir --prefix $dbid -u $backupdir --restore-stats include --report-status-dir $logdir -t $dumpid
done

特别注意
.1. gpdbrestore -e 参数表示恢复前是否执行 drop database, 然后执行 create database。

所以如果目标环境没有对应的数据库的话,不需要加-e参数,否则会报错。  
表级恢复也不要使用-e。  

.2. 如果 gpcrondump 时使用了-C 参数, 则恢复时会先执行DROP TABLE再执行建表的动作。
.3. 如果 gpcrondump 时没有使用 -C 参数,参数恢复时想先清理数据的话,可以使用gpdbrestore的--truncate参数
(--truncate只能是表级恢复模式下使用, 即与-T . 或 --table-file 一同使用)
.4. Greenplum不允许删除模板库, 所以如果使用-e恢复模板库,会报错。 解决方法是改gpcrondump代码,对于模板库特殊处理,例如drop schema的方式清理模板库, 跳过模板库的DROP database报错以及create database 报错。

本节用到的 gpdbrestore 参数介绍

-a (do not prompt) 

 Do not prompt the user for confirmation.

-b <YYYYMMDD> 

 Looks for dump files in the segment data directories on the Greenplum 
 Database array of hosts in db_dumps/<YYYYMMDD>.

-d <master_data_directory>

 Optional. The master host data directory. If not specified, the value 
 set for $MASTER_DATA_DIRECTORY will be used. 
 
-e (drop target database before restore) 

 Drops the target database before doing the restore and then recreates 
 it. 

-G [include|only]

 Restores global objects such as roles and tablespaces if the global 
 object dump file db_dumps/<date>/gp_global_1_1_<timestamp> is found in 
 the master data directory.

 Specify either "-G only" to only restore the global objects dump file
 or "-G include" to restore global objects along with a normal restore.
 Defaults to "include" if neither argument is provided.

-l <logfile_directory>

 The directory to write the log file. Defaults to ~/gpAdminLogs. 

-m (restore metadata only)

 Performs a restore of database metadata (schema and table definitions, SET
 statements, and so forth) without restoring data.  If the --restore-stats or
 -G options are provided as well, statistics or globals will also be restored.

 The --noplan and --noanalyze options are not supported in conjunction with
 this option, as they affect the restoration of data and no data is restored.

--prefix <prefix_string> 

 If you specified the gpcrondump option --prefix <prefix_string> to create 
 the backup, you must specify this option with the <prefix_string> when 
 restoring the backup. 

 If you created a full backup of a set of tables with gpcrondump and 
 specified a prefix, you can use gpcrondump with the options 
 --list-filter-tables and --prefix <prefix_string> to list the tables
 that were included or excluded for the backup. 

--restore-stats [include|only]

 Restores optimizer statistics if the statistics dump file
 db_dumps/<date>/gp_statistics_1_1_<timestamp> is found in the master data
 directory. Setting this option automatically skips the final analyze step,
 so it is not necessary to also set the --noanalyze flag in conjunction with
 this one.

-t <timestamp_key>

 The 14 digit timestamp key that uniquely identifies a backup set of data 
 to restore. It is of the form YYYYMMDDHHMMSS. Looks for dump files 
 matching this timestamp key in the segment data directories db_dumps 
 directory on the Greenplum Database array of hosts. 

-T <schema>.<table_name>

 Table names to restore, specify multiple times for multiple tables. The 
 named table(s) must exist in the backup set of the database being restored. 
 Existing tables are not automatically truncated before data is restored 
 from backup. If your intention is to replace existing data in the table 
 from backup, truncate the table prior to running gpdbrestore -T. 

-S <schema>

 Schema names to restore, specify multiple times for multiple schemas. 
 Existing tables are not automatically truncated before data is restored 
 from backup. If your intention is to replace existing data in the table 
 from backup, truncate the table prior to running gpdbrestore -S. 

--truncate

 Truncate table data before restoring data to the table from the backup.
 This option is supported only when restoring a set of tables with the 
 option -T or --table-file. 
 This option is not supported with the -e option.

-u <backup_directory> 

 Specifies the absolute path to the directory containing the db_dumps 
 directory on each host. If not specified, defaults to the data directory 
 of each instance to be backed up. Specify this option if you specified a 
 backup directory with the gpcrondump option -u when creating a backup 
 set. 
  
 If <backup_directory> is not writable, backup operation report status 
 files are written to segment data directories. You can specify a 
 different location where report status files are written with the 
 --report-status-dir option. 

版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。

分享:
阿里云数据库
使用钉钉扫一扫加入圈子
+ 订阅

帮用户承担一切数据库风险,给您何止是安心!

官方博客
链接