PostgreSQL HOT STANDBY using log shipping

本文涉及的产品
云原生数据库 PolarDB MySQL 版,通用型 2核4GB 50GB
云原生数据库 PolarDB PostgreSQL 版,标准版 2核4GB 50GB
简介:
PostgreSQL HOT STANDBY by log shipping 测试:
一、准备硬件
1. 主节点硬件配置
DISK : 146GB*6
MEM : 14GB
CPU : 2.83GHz*8
2. standby节点硬件配置
DISK : 146GB*4
MEM : 8GB
CPU : 2.0GHz*8

二、准备环境
1. 系统
Red Hat Enterprise Linux Server release 5.5 (Tikanga) x64
2. 时钟同步
8 * * * * /usr/sbin/ntpdate asia.pool.ntp.org && /sbin/hwclock --systohc
3. 配置目录
mkdir -p /database/pgdata/tbs1
mkdir -p /database/pgdata/tbs2
mkdir -p /database/pgdata/tbs3
mkdir -p /database/pgdata/tbs4
mkdir -p /database/pgdata/tbs5
fdisk
mkfs.ext3
mount /dev/cciss/c0d1p1 /database/pgdata/tbs1
mount /dev/cciss/c0d2p1 /database/pgdata/tbs2
mount /dev/cciss/c0d3p1 /database/pgdata/tbs3
mount /dev/cciss/c0d4p1 /database/pgdata/tbs4
mount /dev/cciss/c0d5p1 /database/pgdata/tbs5
master节点:
[root@db-172-16-3-33 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/cciss/c0d0p1      31G  8.1G   21G  29% /
/dev/cciss/c0d0p3      88G  1.7G   81G   3% /opt
tmpfs                 6.9G     0  6.9G   0% /dev/shm
/dev/cciss/c0d1p1     135G   76M  128G   1% /database/pgdata/tbs1
/dev/cciss/c0d2p1     135G  6.1G  122G   5% /database/pgdata/tbs2
/dev/cciss/c0d3p1     135G  3.3G  125G   3% /database/pgdata/tbs3
/dev/cciss/c0d4p1     135G  5.6G  123G   5% /database/pgdata/tbs4
/dev/cciss/c0d5p1     135G   16G  113G  13% /database/pgdata/tbs5

slave节点:
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              31G  3.5G   26G  13% /
/dev/sda3              94G  386M   89G   1% /opt
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/sdb1             134G   76M  128G   1% /database/pgdata/tbs1
/dev/sdc1             134G  188M  127G   1% /database/pgdata/tbs2
/dev/sdd1             134G  2.9G  125G   3% /database/pgdata/tbs3
172.16.3.33:/database/pgdata/tbs4
                      135G  5.6G  123G   5% /database/pgdata/tbs4

vi /etc/fstab
4. 在主节点配置nfs,将wal归档目录export出去,
(这里没有使用中央日志服务器,有条件的话还是需要一个比较大的日志服务器为好,以便支持更多的slave节点)
/database/pgdata/tbs4 172.16.3.39/32(rw,no_root_squash,sync)
  slave节点mount这个目录.
  确保master节点和slave节点的postgres用户gid uid相同,否则可能有权限的问题.
5. 配置内核参数等
kernel.shmmni = 4096
kernel.sem = 501000 6412800000 501000 12800
fs.file-max = 767246
net.ipv4.ip_local_port_range = 1024 65000
net.core.rmem_default = 1048576
net.core.rmem_max = 1048576
net.core.wmem_default = 262144
net.core.wmem_max = 262144
net.ipv4.tcp_tw_recycle=1 
net.ipv4.tcp_max_syn_backlog=4096 
net.core.netdev_max_backlog=10000
vm.overcommit_memory=0
net.ipv4.ip_conntrack_max=655360

*  soft    nofile  131072
*  hard    nofile  131072
*  soft    nproc   131072
*  hard    nproc   131072
*  soft    core    unlimited
*  hard    core    unlimited
*  soft    memlock 50000000
*  hard    memlock 50000000

6. 配置系统服务
chkconfig --level 35 nfs on
chkconfig --level 35 portmap pn

7. 配置防火墙
vi /etc/sysconfig/iptables

8. 升级操作系统补丁,驱动等

三、安装PostgreSQL 9.0.2
1. postgres user profile:
export PS1="$USER@`/bin/hostname -s`-> "
export PGPORT=1921
export PGDATA=/database/pgdata/tbs1/pg_root
export PGARCHIVE=/database/pgdata/tbs4/pg_arch

export LANG=en_US.utf8

export PGHOME=/opt/pgsql
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib
export DATE=`date +"%Y%m%d%H%M"`
export PATH=$PGHOME/bin:$PATH:.
export MANPATH=$PGHOME/share/man:$MANPATH
alias rm='rm -i'
alias ll='ls -lh'

2. 配置数据库相关目录
2.1 log
  /var/applog/pg_log
2.2 pghome
  /opt/pgsql
2.3 pgdata
  /database/pgdata/tbs1/pg_root
2.4 pgarchive
  /database/pgdata/tbs4/pg_arch

3. 初始化数据库
initdb -D /database/pgdata/tbs1/pg_root -E UTF8 --locale=C -U postgres -X /database/pgdata/tbs2/pg_xlog -W 

四、配置master节点
1. pg_hba.conf
# "local" is for Unix domain socket connections only
local   all             all                                     trust
# IPv4 local connections:
host    all             all             127.0.0.1/32            trust
# IPv6 local connections:
# host    all             all             ::1/128                 trust
host all all 0.0.0.0/0  md5

2. postgresql.conf
listen_addresses = '*'          # what IP address(es) to listen on;
port = 1921                             # (change requires restart)
max_connections = 2000                  # (change requires restart)
unix_socket_directory = '/database/pgdata/tbs1/pg_root'         # (change requires restart)
unix_socket_permissions = 0700          # begin with 0 to use octal notation
password_encryption = on
shared_buffers = 2048MB                 # min 128kB
maintenance_work_mem = 2048MB           # min 1MB
max_stack_depth = 8MB                   # min 100kB
wal_level = hot_standby                 # minimal, archive, or hot_standby
synchronous_commit = off                # immediate fsync at commit
wal_sync_method = fdatasync             # the default is the first option 
wal_buffers = 128000kB                  # min 32kB
wal_writer_delay = 20ms                 # 1-10000 milliseconds
checkpoint_segments = 64                # in logfile segments, min 1, 16MB each
checkpoint_timeout = 30min              # range 30s-1h
archive_mode = on               # allows archiving to be done
archive_command = 'cp %p $PGARCHIVE/%f'         # command to use to archive a logfile segment
max_wal_senders = 30            # max number of walsender processes
random_page_cost = 2.0                  # same scale as above
effective_cache_size = 12800MB
constraint_exclusion = partition        # on, off, or partition
log_destination = 'csvlog'              # Valid values are combinations of
logging_collector = on          # Enable capturing of stderr and csvlog
log_directory = '/var/applog/pg_log'            # directory where log files are written,
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
log_truncate_on_rotation = on           # If on, an existing log file of the
log_rotation_age = 1d                   # Automatic rotation of logfiles will
log_rotation_size = 10MB                # Automatic rotation of logfiles will 
log_min_duration_statement = 1000ms     # -1 is disabled, 0 logs all statements
log_checkpoints = on
log_lock_waits = on                     # log lock waits >= deadlock_timeout
log_statement = 'ddl'                   # none, ddl, mod, all
track_activity_query_size = 2048        # (change requires restart)
autovacuum = on                 # Enable autovacuum subprocess?  'on' 
log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
check_function_bodies = on
bytea_output = 'escape'                 # hex, escape
datestyle = 'iso, mdy'
lc_messages = 'C'                       # locale for system error message
lc_monetary = 'C'                       # locale for monetary formatting
lc_numeric = 'C'                        # locale for number formatting
lc_time = 'C'                           # locale for time formatting
default_text_search_config = 'pg_catalog.english'
deadlock_timeout = 1s

3. 启动主节点.

五、传输基础文件至slave节点,模拟一个正在运行的数据库生成复制库的操作.
1. on the master
select pg_start_backup('replication backup');
2. on the master
scp $PGDATA $SLAVE_IP:$PGDATA
3. on the master
select pg_stop_backup();

六、配置slave节点
1. on the slave
chown -R postgres:postgres $PGDATA
su - postgres
cd $PGDATA
rm postmaster.pid
rm -rf pg_xlog
ln -s /database/pgdata/tbs2/pg_xlog ./pg_xlog
2. 配置postgresql.conf
listen_addresses = '*'          # what IP address(es) to listen on;
port = 1921                             # (change requires restart)
max_connections = 2000                  # (change requires restart)
unix_socket_directory = '/database/pgdata/tbs1/pg_root'         # (change requires restart)
unix_socket_permissions = 0700          # begin with 0 to use octal notation
password_encryption = on
shared_buffers = 2048MB                 # min 128kB
maintenance_work_mem = 2048MB           # min 1MB
max_stack_depth = 8MB                   # min 100kB
wal_level = hot_standby                 # minimal, archive, or hot_standby
synchronous_commit = off                # immediate fsync at commit
wal_sync_method = fdatasync             # the default is the first option 
wal_buffers = 128000kB                  # min 32kB
wal_writer_delay = 20ms                 # 1-10000 milliseconds
checkpoint_segments = 64                # in logfile segments, min 1, 16MB each
checkpoint_timeout = 30min              # range 30s-1h
archive_mode = on               # allows archiving to be done
archive_command = 'cp %p $PGARCHIVE/%f'         # command to use to archive a logfile segment
max_wal_senders = 30            # max number of walsender processes
hot_standby = off                        # 这个参数在初始化slave的时候关闭是比较明智的选择,在初始同步完成后在开启
random_page_cost = 2.0                  # same scale as above
effective_cache_size = 12800MB
constraint_exclusion = partition        # on, off, or partition
log_destination = 'csvlog'              # Valid values are combinations of
logging_collector = on          # Enable capturing of stderr and csvlog
log_directory = '/var/applog/pg_log'            # directory where log files are written,
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
log_truncate_on_rotation = on           # If on, an existing log file of the
log_rotation_age = 1d                   # Automatic rotation of logfiles will
log_rotation_size = 10MB                # Automatic rotation of logfiles will 
log_min_duration_statement = 1000ms     # -1 is disabled, 0 logs all statements
log_checkpoints = on
log_lock_waits = on                     # log lock waits >= deadlock_timeout
log_statement = 'ddl'                   # none, ddl, mod, all
track_activity_query_size = 2048        # (change requires restart)
autovacuum = on                 # Enable autovacuum subprocess?  'on' 
log_autovacuum_min_duration = 0 # -1 disables, 0 logs all actions and
check_function_bodies = on
bytea_output = 'escape'                 # hex, escape
datestyle = 'iso, mdy'
lc_messages = 'C'                       # locale for system error message
lc_monetary = 'C'                       # locale for monetary formatting
lc_numeric = 'C'                        # locale for number formatting
lc_time = 'C'                           # locale for time formatting
default_text_search_config = 'pg_catalog.english'
deadlock_timeout = 1s


3. 配置recovery.conf
restore_command = 'cp $PGARCHIVE/%f %p'         # e.g. 'cp /mnt/server/archivedir/%f %p'
archive_cleanup_command = 'pg_archivecleanup $PGARCHIVE %r'
standby_mode = 'on'
trigger_file = '/database/pgdata/tbs1/pg_root/postgresql.trigger.1921'

4. 启动slave节点

启动完后,可以通过top看到slave节点在拼命的恢复pg_start_backup以来的所有wal.
恢复完后修改hot_standby = on,重启slave节点

七、测试
1. (on master)新建用户
create role digoal nosuperuser login encrypted password 'digoal';
2. 新建表空间
on master
su - postgres
mkdir /database/pgdata/tbs3/tbs_test
on slave
su - postgres
mkdir /database/pgdata/tbs3/tbs_test
on master
create tablespace tbs_digoal owner test location '/database/pgdata/tbs3/tbs_digoal';
3. (on master)新建数据库
create database digoal with owner digoal template template0 encoding 'UTF8' tablespace tbs_digoal;
4. (on master)新建schema
\c digoal digoal
create schema digoal authorization digoal;
5. (on master)新建表
\c digoal digoal
create table tbl_users (id int8 , nick varchar(32));
6. (on master)插入测试数据
insert into tbl_users select generate_series(1,10000000),'digoal';
由于插入数据量比较大,可以很明显的看到pg_arch目录中的WAL在增加,如
-rw------- 1 postgres postgres      254 Dec 30 15:17 000000010000000000000004.00000020.backup
-rw------- 1 postgres postgres 67108864 Dec 30 15:19 000000010000000100000024
-rw------- 1 postgres postgres 67108864 Dec 30 15:19 000000010000000100000025
生成WAL后,slave节点又开始recover事件,recover完成后等待下一个wal如:
25456 postgres  18   0 2389m 1364  736 S  0.0  0.0   0:00.00 postgres: startup process   waiting for 00000001000000010000000E 

7. (on master)使用DDL测试冲突
on master
alter table tbl_users add column first_name default 'zhou';
alter table tbl_users add column last_name default 'digoal';
在slave恢复期间,在slave节点执行 select count(*) from tbl_users;发生等待事件.

on slave /var/applog/pg_log中查看最近一个日志文件,
2010-12-30 15:04:01.462 CST,"digoal","digoal",25240,"127.0.0.1:43079",4d1c2edf.6298,1,"SELECT waiting",2010-12-30 15:03:59 CST,2/14,0,LOG,00000,"process 25240 still waiting for AccessShareLock on relation 16388 of database 16386 after 1000.564 ms",,,,,,"select count(*) from tbl_users;",22,,"psql"

如果数据库没有其他操作了,不再发生ARCHIVE操作时,你可能会发现主节点已经alter完了,slave节点还是在等待.
原因是alter完的log信息存在的XLOG还没有发生归档,slave节点会一直等待下去(这时可以手工执行pg_switch_xlog).

8. (on master)测试checkpoint
在PostgreSQL中发生checkpoint后,在此之前的WAL在做数据库恢复时就用不到了,因为确保数据都写入数据文件了.
pg_archivecleanup也是根据checkpoint来判断和删除不需要的WAL的.

9. (on slave)测试cleanarchive
在做checkpoint前,去看$PGARCHIVE目录,已经被apply的文件还存在,并没有被pg_archivecleanup命令清除掉,原因就是这些文件是最近一次checkpoint以来的WAL文件,在数据库恢复时是需要用到的.
如果你手工执行pg_archivecleanup $PGARCHIVE 000000010000000200000031 (假设000000010000000200000031这个是在$PGARCHIVE中的一个WAL的文件名)
这条命令将删除000000010000000200000031以前生成的所有WAL文件,一定要小心操作,万一不小心把最近一次CHECKPOINT以来的WAL删除了,
补救的方法是赶紧到master上做一次checkpoint,让slave知道这次checkpoint,否则的话下次slave启动还会读到000000010000000200000031这个文件以前的文件,那时候就只能找到这些文件或重建slave了.

10. (on slave)测试active slave
激活SLAVE很简单,了解到已经apply了最新的WAL后,执行以下
su - postgres
touch /database/pgdata/tbs1/pg_root/postgresql.trigger.1921
数据库会触发激活的动作,激活后/database/pgdata/tbs1/pg_root/postgresql.trigger.1921这个文件会自动删掉,并且recovery.conf被重命名为recovery.done.
激活后的slave不可逆转为slave了.需要重建.

11. 监控
pg_current_xlog_insert_location
pg_current_xlog_location
pg_last_xlog_receive_location
pg_last_xlog_replay_location
top
CREATE OR REPLACE VIEW pg_stat_replication AS
    SELECT
            S.procpid,
            S.usesysid,
            U.rolname AS usename,
            S.application_name,
            S.client_addr,
            S.client_port,
            S.backend_start
    FROM pg_stat_get_activity(NULL) AS S, pg_authid U
    WHERE S.usesysid = U.oid AND S.datid = 0;

八、附pgctl.sh脚本
#!/bin/bash

# environment.
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

export PGHOME=/opt/pgsql
export PATH=$PGHOME/bin:$PATH
export PGDATA=/database/pgdata/tbs1/pg_root
export PGPORT=1921
export LANG='en_US.utf8'
export LD_LIBRARY_PATH=$PGHOME/lib:/lib64:/usr/lib64:/usr/local/lib64:/lib:/usr/lib:/usr/local/lib

RETVAL=1

start() {
su - postgres -c "/usr/bin/nohup $PGHOME/bin/postgres -D $PGDATA -p $PGPORT >/dev/null 2>>/var/applog/pg_log/start_err.log  </dev/null &"
         RETVAL=$?
         return $RETVAL
}

stop() {
su - postgres -c "$PGHOME/bin/pg_ctl stop -D $PGDATA -m fast"
         RETVAL=$?
         return $RETVAL
}

reload() {
su - postgres -c "$PGHOME/bin/pg_ctl reload -D $PGDATA"
         RETVAL=$?
         return $RETVAL
}

# See how we were called.
case "$1" in
  start)
        start
        ;;
  stop)
        stop
        ;;
  restart)
        stop
        start
        ;;
  reload)
        reload
        ;;
  *)
        echo $"Usage: $prog {start|stop|restart|reload}"
        exit 2
esac

exit $RETVAL
# Auth Digoal.Zhou
# Corp. Sky-Mobi


久、其他
1. 自9.0以后,PostgreSQL引入了一个叫pg_archivecleanup的模块,简化了standby的配置.以前通过pg_standby来实现的.
2. 第二象限开发的基于PostgreSQL 内部复制的产品
http://projects.2ndquadrant.com/repmgr
3. other blog
PostgreSQL 9.1 Allow standby recovery to switch to a new timeline automatically

PostgreSQL 9.2 devel adding cascading replication support

PostgreSQL HOT STANDBY using Stream

PostgreSQL cluster role switchover between primary and standby

We can ignore the performance influence when use sync replication in PostgreSQL 9.1  

PostgreSQL 9.1 Replication role privilege change to REPLICATION from SUPERUSER

PostgreSQL 9.0.2 Replication Best Practices

PostgreSQL replication monitor

New replication mode: async, write, fsync, replay

PostgreSQL HOT STANDBY using log shipping
相关实践学习
使用PolarDB和ECS搭建门户网站
本场景主要介绍基于PolarDB和ECS实现搭建门户网站。
阿里云数据库产品家族及特性
阿里云智能数据库产品团队一直致力于不断健全产品体系,提升产品性能,打磨产品功能,从而帮助客户实现更加极致的弹性能力、具备更强的扩展能力、并利用云设施进一步降低企业成本。以云原生+分布式为核心技术抓手,打造以自研的在线事务型(OLTP)数据库Polar DB和在线分析型(OLAP)数据库Analytic DB为代表的新一代企业级云原生数据库产品体系, 结合NoSQL数据库、数据库生态工具、云原生智能化数据库管控平台,为阿里巴巴经济体以及各个行业的企业客户和开发者提供从公共云到混合云再到私有云的完整解决方案,提供基于云基础设施进行数据从处理、到存储、再到计算与分析的一体化解决方案。本节课带你了解阿里云数据库产品家族及特性。
目录
相关文章
|
6月前
|
存储 Oracle 关系型数据库
postgresql数据库|wal日志的开启以及如何管理
postgresql数据库|wal日志的开启以及如何管理
1080 0
|
关系型数据库 物联网 PostgreSQL
沉浸式学习PostgreSQL|PolarDB 11: 物联网(IoT)、监控系统、应用日志、用户行为记录等场景 - 时序数据高吞吐存取分析
物联网场景, 通常有大量的传感器(例如水质监控、气象监测、新能源汽车上的大量传感器)不断探测最新数据并上报到数据库. 监控系统, 通常也会有采集程序不断的读取被监控指标(例如CPU、网络数据包转发、磁盘的IOPS和BW占用情况、内存的使用率等等), 同时将监控数据上报到数据库. 应用日志、用户行为日志, 也就有同样的特征, 不断产生并上报到数据库. 以上数据具有时序特征, 对数据库的关键能力要求如下: 数据高速写入 高速按时间区间读取和分析, 目的是发现异常, 分析规律. 尽量节省存储空间
752 1
|
监控 关系型数据库 数据库
PostgreSQL 恢复模式错误日志增强 - 提供正在恢复的WAL(XLOG)文件位置
标签 PostgreSQL , the database system is starting up , the database system is in recovery mode 背景 当数据库异常停库,再次启动时。
2995 0
|
6月前
|
SQL 关系型数据库 数据库
实时计算 Flink版产品使用合集之同步PostgreSQL数据时,WAL 日志无限增长,是什么导致的
实时计算Flink版作为一种强大的流处理和批处理统一的计算框架,广泛应用于各种需要实时数据处理和分析的场景。实时计算Flink版通常结合SQL接口、DataStream API、以及与上下游数据源和存储系统的丰富连接器,提供了一套全面的解决方案,以应对各种实时计算需求。其低延迟、高吞吐、容错性强的特点,使其成为众多企业和组织实时数据处理首选的技术平台。以下是实时计算Flink版的一些典型使用合集。
|
关系型数据库 PostgreSQL
PostgreSQL事务提交日志与CLOG操作初步认识
PostgreSQL事务提交日志与CLOG操作初步认识
225 0
|
消息中间件 数据采集 监控
ELK搭建(七):搭建PostgreSQL慢查询、错误日志监控平台
PostgreSQL是一款功能非常强大的的关系性数据库,适用于需要执行复杂查询的系统。市面上越来越多的公司开始采用PostgreSQL作为主数据库。 今天我们就来讲解如何搭建一个PostgreSQL的慢日志、错误日志监控平台,实时了解到数据库的日志情况,来帮助我们快速排错及优化。
841 0
ELK搭建(七):搭建PostgreSQL慢查询、错误日志监控平台
|
SQL 关系型数据库 数据库
PostgreSQL 服务器日志 pg_log
10.0版本PostgreSQL,存在三种日志 WAL日志,即重做日志,一般不可读 日志对应目录为 $PGDATA/pg_xlog 事务提交日志,记录的是事务的元数据 日志对应目录为 $PGDATA/pg_clog 数据库运行日志 日志对应目录为$PGDATA/pg_log 前两种日志,虽然仍然非常重要,但却是不可读的,我们日常使用不多。
7211 0
|
SQL 存储 弹性计算
日志审计:开通RDS PostgreSQL日志采集
日志审计原已支持采集RDS MySQL的审计日志(基于SQL洞察)、慢日志、性能日志、错误日志,随着用户的使用深入,更多用户对RDS PostgreSQL 日志的采集也提出了需求,日志审计率先响应用户需求,现已支持RDS PostgreSQL 审计日志(基于SQL洞察)、慢日志、错误日志。
|
关系型数据库 PostgreSQL
postgresql 的WAL日志解析工具 pg_waldump
postgresql 的WAL日志解析工具 pg_waldump
1507 0
postgresql 的WAL日志解析工具 pg_waldump
|
SQL Oracle 关系型数据库
xDB Replication Server - PostgreSQL, Oracle, SQL Server, PPAS 全量、增量(redo log based, or trigger based)同步(支持single-master, mult-master同步, 支持DDL)
xDB Replication Server - PostgreSQL, Oracle, SQL Server, PPAS 全量、增量(redo log based, or trigger based)同步(支持single-master, mult-master同步, 支持DDL)
852 0