为什么pg_basebackup或pg_start_backup好像hang住确没有开始拷贝文件 - checkpoint 的几种调度(checkpoint_completion_target)

简介: 标签 PostgreSQL , checkpoint , 调度 , lazy , immediate , pg_start_backup , pg_basebackup 背景 PostgreSQL支持在线全量备份与增量归档备份。在线全量备份实际上就是拷贝文件,增量备份则分为两种,一种是基于BLOCK lsn变化的BLOCK即增量备份,另一种是基于WAL的持续归档文件备份。 全量备份通常

标签

PostgreSQL , checkpoint , 调度 , lazy , immediate , pg_start_backup , pg_basebackup


背景

PostgreSQL支持在线全量备份与增量归档备份。在线全量备份实际上就是拷贝文件,增量备份则分为两种,一种是基于BLOCK lsn变化的BLOCK即增量备份,另一种是基于WAL的持续归档文件备份。

全量备份通常使用pg_basebackup客户端实现,或者使用SQL函数pg_start_backup()+COPY文件、打快照的方式实现。

全量备份开启前,需要对数据库做一次checkpoint,并强制开启full page write,确保partial block在后续可以通过wal进行恢复。备份结束时通过pg_stop_backup告知,关闭full page write(如果参数开启了FPW则不受影响)。

有时你可能会发现使用pg_basebackup或pg_start_backup接口时,好像hang住确没有开始拷贝文件。实际上是在做checkpoint,但是为什么这个checkpoint比较慢,但是直接SQL执行checkopint命令确很快呢?

原因是checkpoint分为调度和非调度模式。

调度模式的checkpoint和checkpoint_completion_target以及配置的max_wal_size区间大小有关,checkpoint_completion_target和max_wal_size越大,表示这个checkpoint将在这么大的区间内调度完成,所以总耗时会非常长,好处是减少CHECKPOINT带来的大量刷脏和FSYNC,从而减少抖动。

坏处就是你会发现checkpoint很漫长。

非调度模式的checkpoint,就是尽快完成检查点,会全速刷脏,不进行调度。好处是快,坏处是,如果脏页特别多,可能会有大量IO影响其他会话性能。

postgres=# show max_wal_size ;  
 max_wal_size   
--------------  
 128GB  
(1 row)  
  
postgres=# show min_wal_size;  
 min_wal_size   
--------------  
 32GB  
(1 row)  
             
postgres=# show checkpoint_completion_target ;  
 checkpoint_completion_target   
------------------------------  
 0.1  
(1 row)  

代码中可以看到,checkpoint有如下flag来控制检查点行为。

 * RequestCheckpoint  
 *              Called in backend processes to request a checkpoint  
 *  
 * flags is a bitwise OR of the following:  
 *      CHECKPOINT_IS_SHUTDOWN: checkpoint is for database shutdown.  
 *      CHECKPOINT_END_OF_RECOVERY: checkpoint is for end of WAL recovery.  
 *      CHECKPOINT_IMMEDIATE: finish the checkpoint ASAP,  
 *              ignoring checkpoint_completion_target parameter.  
 *      CHECKPOINT_FORCE: force a checkpoint even if no XLOG activity has occurred  
 *              since the last one (implied by CHECKPOINT_IS_SHUTDOWN or  
 *              CHECKPOINT_END_OF_RECOVERY).  
 *      CHECKPOINT_WAIT: wait for completion before returning (otherwise,  
 *              just signal checkpointer to do it, and return).  
 *      CHECKPOINT_CAUSE_XLOG: checkpoint is requested due to xlog filling.  
 *              (This affects logging, and in particular enables CheckPointWarning.)  
 */  
void  
RequestCheckpoint(int flags)  

start backup如何控制是使用快速checkpoint(非调度模式)、或者调度模式的checkpoint呢?

XLogRecPtr  
do_pg_start_backup(const char *backupidstr, bool fast, TimeLineID *starttli_p,  
                                   StringInfo labelfile, DIR *tblspcdir, List **tablespaces,  
                                   StringInfo tblspcmapfile, bool infotbssize,  
                                   bool needtblspcmapfile)  
{  
  
  
  
                         * Since the fact that we are executing do_pg_start_backup()  
                         * during recovery means that checkpointer is running, we can use  
                         * RequestCheckpoint() to establish a restartpoint.  
                         *  
                         * We use CHECKPOINT_IMMEDIATE only if requested by user (via  
                         * passing fast = true).  Otherwise this can take awhile.  
                         */  
                        RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT |  
                                                          (fast ? CHECKPOINT_IMMEDIATE : 0));  

1、pg_basebackup客户端命令,通过-c参数控制(fast表示使用非调度模式checkpoint)

  -c, --checkpoint=fast|spread  
                         set fast or spread checkpointing  

2、pg_start_backup SQL函数,通过参数fast控制

postgres=# \df pg_start_backup  
                                                        List of functions  
   Schema   |      Name       | Result data type |                          Argument data types                           | Type   
------------+-----------------+------------------+------------------------------------------------------------------------+------  
 pg_catalog | pg_start_backup | pg_lsn           | label text, fast boolean DEFAULT false, exclusive boolean DEFAULT true | func  
(1 row)  

小结

如果你需要快速的开始备份,可以使用fast(非调度模式)参数。

 

 

免费领取阿里云RDS PostgreSQL实例、ECS虚拟机

目录
相关文章
|
7月前
|
SQL 关系型数据库 数据库连接
`pg_dump` 和 `pg_restore`
`pg_dump` 和 `pg_restore`
97 4
|
缓存 关系型数据库 数据库
PG:checkpoint是什么
PG:checkpoint是什么
104 0
|
关系型数据库 数据库
Harbor断电重启postgres报错 could not locate a valid checkpoint record
Harbor断电重启postgres报错 could not locate a valid checkpoint record
431 0
|
测试技术
pg_rewind实例--could not find previous WAL record at %X/%X
pg_rewind实例--could not find previous WAL record at %X/%X
127 0
|
存储 关系型数据库 MySQL
mysql 数据库无法启动(Ignoring the redo log due to missing MLOG_CHECKPOINT between the checkpoint .... and)
数据库机器的CPU和主板都换了,重新开机,发现mysql数据库无法启动!
415 0
|
关系型数据库 数据库 PostgreSQL
PG异常无法启动的问题:could not read file "pg_logical/replorigin_checkpoint": Success
问题描述 新安装不久的PostgreSQL数据库,断电后重启,查看日志如下 2019-01-08 08:44:19.989 UTC [7493] LOG: database system was interrupted; last known up at 2018-12-24 10:56:28 UTC 2019-01-08 08:44:19.
3415 0