本文以实践的方式展示如何利用开源PolarDB-PG和Patroni搭建高可用集群。文中使用了三台ECS,内核使用了开源PolarDB-PG 15版本,patroni版本为4.0.3,etcd版本为3.5.0,其它系统资源及部署组件见下表:
如果没有特殊说明,表示命令需要在三台机器上均要执行。
ETCD安装
准备用户
首先需要准备一个用户,用于管理ETCD与初始化数据库,本文使用了同一个用户postgres。也可以使用不同的用户,比如分别用etcd与postgres。
useradd postgres
安装ETCD
接下来安装ETCD,这里采用了下载tar安装包的方式手动安装,也可以使用yum install etcd方式安装。
# 获取安装包 wget https://github.com/etcd-io/etcd/releases/download/v3.5.0/etcd-v3.5.0-linux-amd64.tar.gz tar -xvf etcd-v3.5.0-linux-amd64.tar.gz cd etcd-v3.5.0-linux-amd64 sudo mv etcd /usr/local/bin/ sudo mv etcdctl /usr/local/bin/ # 创建目录并把权限赋给postgres mkdir /etc/etcd mkdir /var/lib/etcd chown -R postgres:postgres /usr/local/bin/etcd /etc/etcd chown -R postgres:postgres /usr/local/bin/etcdctl chown -R postgres:postgres /var/lib/etcd
准备ETCD配置文件
以下分别为三个节点的etcd配置文件,使用的ETCD_DATA_DIR为/var/lib/etcd
cat > /etc/etcd/etcd.conf << EOF ETCD_NAME=etcd1 ETCD_DATA_DIR="/var/lib/etcd" ETCD_LISTEN_PEER_URLS="http://172.17.206.71:2380" ETCD_LISTEN_CLIENT_URLS="http://127.0.0.1:2379,http://172.17.206.71:2379" ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.17.206.71:2380" ETCD_INITIAL_CLUSTER="etcd1=http://172.17.206.71:2380,etcd2=http://172.17.206.72:2380,etcd3=http://172.17.206.73:2380" ETCD_INITIAL_CLUSTER_STATE="new" ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster" ETCD_ADVERTISE_CLIENT_URLS="http://172.17.206.71:2379" EOF cat > /etc/etcd/etcd.conf << EOF ETCD_NAME=etcd2 ETCD_DATA_DIR="/var/lib/etcd" ETCD_LISTEN_PEER_URLS="http://172.17.206.72:2380" ETCD_LISTEN_CLIENT_URLS="http://127.0.0.1:2379,http://172.17.206.72:2379" ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.17.206.72:2380" ETCD_INITIAL_CLUSTER="etcd1=http://172.17.206.71:2380,etcd2=http://172.17.206.72:2380,etcd3=http://172.17.206.73:2380" ETCD_INITIAL_CLUSTER_STATE="new" ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster" ETCD_ADVERTISE_CLIENT_URLS="http://172.17.206.72:2379" EOF cat > /etc/etcd/etcd.conf << EOF ETCD_NAME=etcd3 ETCD_DATA_DIR="/var/lib/etcd" ETCD_LISTEN_PEER_URLS="http://172.17.206.73:2380" ETCD_LISTEN_CLIENT_URLS="http://127.0.0.1:2379,http://172.17.206.73:2379" ETCD_INITIAL_ADVERTISE_PEER_URLS="http://172.17.206.73:2380" ETCD_INITIAL_CLUSTER="etcd1=http://172.17.206.71:2380,etcd2=http://172.17.206.72:2380,etcd3=http://172.17.206.73:2380" ETCD_INITIAL_CLUSTER_STATE="new" ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster" ETCD_ADVERTISE_CLIENT_URLS="http://172.17.206.73:2379" EOF
配置service并启动ETCD
这一步是为了配置由systemd管理ETCD,这样能实现开机自启动,并且在进程异常退出的时候能自动重启。先创建service文件,
vi /usr/lib/systemd/system/etcd.service
并填为以下内容,这里使用的配置文件是上一节中的/etc/etcd/etcd.conf
[Unit] Description=Etcd Server After=network.target After=network-online.target Wants=network-online.target [Service] Type=notify WorkingDirectory=/var/lib/etcd/ EnvironmentFile=-/etc/etcd/etcd.conf User=postgres # set GOMAXPROCS to number of processors ExecStart=/bin/bash -c "GOMAXPROCS=$(nproc) /usr/local/bin/etcd" Restart=on-failure LimitNOFILE=65536 RestartSec=5s StandardOutput=journal StandardError=journal [Install] WantedBy=multi-user.target
启动service并查看状态,如果使用的是yum install etcd的方式,可以直接跳到这步。
sudo systemctl daemon-reload sudo systemctl start etcd sudo systemctl enable etcd sudo systemctl status etcd
检查状态
具体使用的命令会因为ETCD版本有所不同,可以使用
etcdctl endpoint health --cluster etcdctl member list
PolarDB-PG15安装
获取rpm包
从github上下载开源PolarDB-PG15的rpm包,并在每台机器上安装。这里使用的是Centos 8系统,因此下载el8的包即可(https://github.com/ApsaraDB/PolarDB-for-PostgreSQL/releases)
rpm -ivh /root/PolarDB-15.8.2.0-f3737b0b.el8.x86_64.rpm
初始化数据目录
切换到postgres用户,并在第一台机器上initdb初始化数据目录,本文使用的目录为/home/postgres/polar15,这个步骤只在第一台机器上完成:
su - postgres mkdir /home/postgres/ /u01/polardb_pg/bin/initdb -D /home/postgres/polar15
接下来可以配置远程登录权限,在/home/postgres/polar15/pg_hba.conf最后加上
# IPv4 local connections: host all all 127.0.0.1/32 md5 host all all 0.0.0.0/0 md5 # replication privilege: host replication replicator 0.0.0.0/24 md5
/home/postgres/polar15/postgresql.conf中配置listen_addresses = '*'
启动数据库
# 启动 /u01/polardb_pg/bin/pg_ctl -D /home/postgres/polar15 -l logfile start
接下来在第一台机器上连接数据库,并创建流复制需要的ROLE,密码可以按需要设置,例如本文配置为pg_rep。
# 创建复制ROLE /u01/polardb_pg/bin/psql -h localhost -p 5432 -U postgres -d postgres create role replicator login replication encrypted password 'pg_rep';
patroni安装
接下来进行patroni的安装,首先使用root用户安装patroni、python3-psycopg2和patroni[etcd]。
pip3 install patroni yum install python3-psycopg2 -y pip3 install patroni[etcd]
准备配置文件
首先在三台机器都创建需要的目录和配置文件postgres-pg01.yml
su - postgres mkdir /home/postgres/etc mkdir /home/postgres/log vi /home/postgres/etc/postgres-pg01.yml
节点1配置文件
scope: postgres namespace: /service/ name: postgresql0 restapi: listen: 0.0.0.0:8008 connect_address: 127.0.0.1:8008 etcd3: hosts: - 172.17.206.71:2379 - 172.17.206.72:2379 - 172.17.206.73:2379 bootstrap: dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 postgresql: use_pg_rewind: true parameters: max_connections: 100 max_locks_per_transaction: 64 max_worker_processes: 8 initdb: - encoding: UTF8 - data-checksums pg_hba: - host all all 0.0.0.0/0 md5 users: admin: password: adminpassword options: - createrole - createdb postgresql: listen: 0.0.0.0:5432 connect_address: 172.17.206.71:5432 # 替换为实际的IP地址 data_dir: /home/postgres/polar15 pgpass: /tmp/pgpass0 bin_dir: /u01/polardb_pg/bin/ authentication: superuser: username: postgres password: yourpassword replication: username: replicator password: pg_rep parameters: unix_socket_directories: '/tmp' tags: nofailover: false noloadbalance: false clonefrom: false nosync: false
节点2配置文件
scope: postgres namespace: /service/ name: postgresql1 restapi: listen: 0.0.0.0:8008 connect_address: 127.0.0.1:8008 etcd3: hosts: - 172.17.206.71:2379 - 172.17.206.72:2379 - 172.17.206.73:2379 bootstrap: dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 postgresql: use_pg_rewind: true parameters: max_connections: 100 max_locks_per_transaction: 64 max_worker_processes: 8 initdb: - encoding: UTF8 - data-checksums pg_hba: - host all all 0.0.0.0/0 md5 users: admin: password: adminpassword options: - createrole - createdb postgresql: listen: 0.0.0.0:5432 connect_address: 172.17.206.72:5432 # 替换为实际的IP地址 data_dir: /home/postgres/polar15 pgpass: /tmp/pgpass0 bin_dir: /u01/polardb_pg/bin/ authentication: superuser: username: postgres password: yourpassword replication: username: replicator password: pg_rep parameters: unix_socket_directories: '/tmp' tags: nofailover: false noloadbalance: false clonefrom: false nosync: false
节点3配置文件
scope: postgres namespace: /service/ name: postgresql2 restapi: listen: 0.0.0.0:8008 connect_address: 127.0.0.1:8008 etcd3: hosts: - 172.17.206.71:2379 - 172.17.206.72:2379 - 172.17.206.73:2379 bootstrap: dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 postgresql: use_pg_rewind: true parameters: max_connections: 100 max_locks_per_transaction: 64 max_worker_processes: 8 initdb: - encoding: UTF8 - data-checksums pg_hba: - host all all 0.0.0.0/0 md5 users: admin: password: adminpassword options: - createrole - createdb postgresql: listen: 0.0.0.0:5432 connect_address: 172.17.206.73:5432 # 替换为实际的IP地址 data_dir: /home/postgres/polar15 pgpass: /tmp/pgpass0 bin_dir: /u01/polardb_pg/bin/ authentication: superuser: username: postgres password: yourpassword replication: username: replicator password: pg_rep parameters: unix_socket_directories: '/tmp' tags: nofailover: false noloadbalance: false clonefrom: false nosync: false
启动patroni
三个节点依次启动patroni进程,观察进程是否正常运行,如果异常退出可以在~/log/patroni.log文件中分析原因。
patroni ~/etc/postgres-pg01.yml > ~/log/patroni.log 2>&1 &
查看集群拓扑,发现patroni已经自动根据配置文件使用pg_basebackup创建了备库。
patronictl -c ~/etc/postgres-pg01.yml topology
后续可以使用patronictl来运维数据库,比如修改配置、重启数据库节点、主备切换等操作。还可以把patroni进程托管给systemd管理。
总结
本文简单介绍了通过patroni搭建PolarDB-PG高可用集群,如果希望在本文基础上做更多高可用方面的测试,建议根据需要对文中的配置进行调整。如果想避免私有化部署的运维成本,也可以选用线上PolarDB-PG产品进行试用,线上提供了更丰富的高可用能力,能够有效降低部署的工作量并提高稳定性。