PXC 5.7 WSREP_SST: [ERROR] xtrabackup_checkpoints missing

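Some time ago, after configuring PXC 5.7, I started one of the nodes and hit: [ERROR] xtrabackup_checkpoints missing. xtrabackup/SST failed on DONOR. The message on the joining node says little by itself; the detailed cause has to be read from the log on another node (the donor). The problem and its fix are described below for reference.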

I. Symptoms

A node of the PXC (MySQL 5.7) cluster failed to restart. The state after the failed restart is shown below.
Current environment:
[root@node146 ~]# more /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@node146 ~]# mysql -V
mysql Ver 14.14 Distrib 5.7.19-17, for Linux (x86_64) using 6.2

[root@node146 ~]# systemctl status mysql.service
mysql.service - Percona XtraDB Cluster
Loaded: loaded (/usr/lib/systemd/system/mysql.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Fri 2017-12-29 09:12:13 CST; 44s ago
Process: 19192 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=0/SUCCESS)
Process: 19160 ExecStop=/usr/bin/mysql-systemd stop (code=exited, status=2)
Process: 18153 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=1/FAILURE)
Process: 18151 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=1/FAILURE)
Process: 18032 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 18151 (code=exited, status=1/FAILURE)
-- error output begins below
Dec 29 09:12:13 node146.example.com mysql-systemd[18153]: ERROR! mysqld_safe with PID 18151 has already exited: FAILURE
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service: control process exited, code=exited status=1
Dec 29 09:12:13 node146.example.com mysql-systemd[19160]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 29 09:12:13 node146.example.com mysql-systemd[19160]: ERROR! mysql already dead
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service: control process exited, code=exited status=2
Dec 29 09:12:13 node146.example.com mysql-systemd[19192]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 29 09:12:13 node146.example.com mysql-systemd[19192]: WARNING: mysql may be already dead
Dec 29 09:12:13 node146.example.com systemd[1]: Failed to start Percona XtraDB Cluster.
Dec 29 09:12:13 node146.example.com systemd[1]: Unit mysql.service entered failed state.
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service failed.

Dec 29 09:11:29 node146.example.com polkitd[18024]: Started polkitd version 0.112
Dec 29 09:11:29 node146.example.com polkitd[18024]: Loading rules from directory /etc/polkit-1/rules.d
Dec 29 09:11:29 node146.example.com polkitd[18024]: Loading rules from directory /usr/share/polkit-1/rules.d
Dec 29 09:11:29 node146.example.com polkitd[18024]: Finished loading, compiling and executing 6 rules
Dec 29 09:11:29 node146.example.com dbus[889]: [system] Successfully activated service 'org.freedesktop.PolicyKit1'
Dec 29 09:11:29 node146.example.com dbus-daemon[889]: dbus[889]: [system] Successfully activated service 'org.freedesktop.PolicyKit1'
Dec 29 09:11:29 node146.example.com systemd[1]: Started Authorization Manager.
-- Subject: Unit polkit.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit polkit.service has finished starting up.
--
-- The start-up result is done.
Dec 29 09:11:29 node146.example.com polkitd[18024]: Acquired the name org.freedesktop.PolicyKit1 on the system bus
Dec 29 09:11:29 node146.example.com gnome-session[2869]: PolicyKit daemon reconnected to bus.
Dec 29 09:11:29 node146.example.com gnome-session[2869]: Attempting to re-register as an authentication agent.
Dec 29 09:11:29 node146.example.com polkitd[18024]: Registered Authentication Agent for unix-session:c1 (system bus name :1.30 [gnome-shell --mode=gdm], object path /org/freedesktop/PolicyKit1/Authentication
Dec 29 09:11:29 node146.example.com gnome-session[2869]: We are now a registered authentication agent.
Dec 29 09:11:29 node146.example.com polkitd[18024]: Registered Authentication Agent for unix-process:18020:4562323 (system bus name :1.288 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/fre
Dec 29 09:11:29 node146.example.com systemd[1]: Starting Percona XtraDB Cluster...
-- Subject: Unit mysql.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mysql.service has begun starting up.
Dec 29 09:11:40 node146.example.com mysqld_safe[18151]: 2017-12-29T01:11:40.538125Z mysqld_safe Logging to '/var/log/mysqld.log'.
Dec 29 09:11:40 node146.example.com mysqld_safe[18151]: 2017-12-29T01:11:40.540867Z mysqld_safe Logging to '/var/log/mysqld.log'.
Dec 29 09:11:40 node146.example.com mysqld_safe[18151]: 2017-12-29T01:11:40.565416Z mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Dec 29 09:11:40 node146.example.com mysqld_safe[18151]: 2017-12-29T01:11:40.569662Z mysqld_safe Skipping wsrep-recover for empty datadir: /var/lib/mysql
Dec 29 09:11:40 node146.example.com mysqld_safe[18151]: 2017-12-29T01:11:40.570611Z mysqld_safe Assigning 00000000-0000-0000-0000-000000000000:-1 to wsrep_start_position
Dec 29 09:11:43 node146.example.com mysql-systemd[18153]: State transfer in progress, setting sleep higher
Dec 29 09:12:04 node146.example.com systemd[1]: mysql.service: main process exited, code=exited, status=1/FAILURE
Dec 29 09:12:13 node146.example.com mysql-systemd[18153]: /usr/bin/mysql-systemd: line 140: kill: (18151) - No such process
Dec 29 09:12:13 node146.example.com mysql-systemd[18153]: ERROR! mysqld_safe with PID 18151 has already exited: FAILURE
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service: control process exited, code=exited status=1
Dec 29 09:12:13 node146.example.com mysql-systemd[19160]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 29 09:12:13 node146.example.com mysql-systemd[19160]: ERROR! mysql already dead
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service: control process exited, code=exited status=2
Dec 29 09:12:13 node146.example.com mysql-systemd[19192]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 29 09:12:13 node146.example.com mysql-systemd[19192]: WARNING: mysql may be already dead
Dec 29 09:12:13 node146.example.com systemd[1]: Failed to start Percona XtraDB Cluster.
-- Subject: Unit mysql.service has failed ## Author : Leshami
-- Defined-By: systemd                           ## Blog : http://blog.csdn.net/leshami
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mysql.service has failed.
--
-- The result is failed.
Dec 29 09:12:13 node146.example.com systemd[1]: Unit mysql.service entered failed state.
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service failed.
Dec 29 09:12:13 node146.example.com polkitd[18024]: Unregistered Authentication Agent for unix-process:18020:4562323 (system bus name :1.288, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, loca

Next, check the mysqld error log:
[root@node146 ~]# tail -fn 100 /var/log/mysqld.log
2017-12-29T01:11:43.416689Z 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> aeb87793-ebb2-11e7-b33e-eeaf4988bbe4:0
2017-12-29T01:11:43.812807Z WSREP_SST: [INFO] WARNING: Stale temporary SST directory: /var/lib/mysql//.sst from previous state transfer. Removing
2017-12-29T01:11:43.841926Z WSREP_SST: [INFO] Proceeding with SST.........
2017-12-29T01:11:43.867695Z WSREP_SST: [INFO] ............Waiting for SST streaming to complete!
2017-12-29T01:11:44.359210Z 0 [Note] WSREP: (3a4aef7c, 'tcp://0.0.0.0:4567') connection to peer 3a4aef7c with addr tcp://192.168.81.146:4567 timed out, no messages seen in PT3S
2017-12-29T01:11:44.359666Z 0 [Note] WSREP: (3a4aef7c, 'tcp://0.0.0.0:4567') turning message relay requesting off
2017-12-29T01:11:53.926533Z WSREP_SST: [ERROR] ******************* FATAL ERROR **********************  -- a fatal error related to xtrabackup; the next line says to check the DONOR log
2017-12-29T01:11:53.927524Z WSREP_SST: [ERROR] xtrabackup_checkpoints missing. xtrabackup/SST failed on DONOR. Check DONOR log
2017-12-29T01:11:53.928649Z WSREP_SST: [ERROR] ******************************************************
2017-12-29T01:11:53.932079Z WSREP_SST: [ERROR] Cleanup after exit with status:2
2017-12-29T01:11:53.942644Z 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '192.168.81.146' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '18614' '' : 2 (No such file or directory)
2017-12-29T01:11:53.942695Z 0 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2017-12-29T01:11:53.942703Z 0 [ERROR] WSREP: SST script aborted with error 2 (No such file or directory)
2017-12-29T01:11:53.942729Z 0 [ERROR] WSREP: SST failed: 2 (No such file or directory)
2017-12-29T01:11:53.942735Z 0 [ERROR] Aborting
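
On the joiner, the lines that matter are buried among routine notes. The following is a minimal sketch for pulling out only the SST-related errors. The function name `sst_errors` is made up for illustration, and the two patterns are taken from the log excerpt above rather than from any documented log format.

```shell
#!/usr/bin/env bash
# Print only the SST-related error lines from a mysqld error log.
# Usage (path as shown in the excerpt above): sst_errors /var/log/mysqld.log
sst_errors() {
    local logfile="$1"
    # "WSREP_SST: [ERROR]" lines are written by the SST script;
    # "[ERROR] WSREP:" lines are written by the server around it.
    grep -E 'WSREP_SST: \[ERROR\]|\[ERROR\] WSREP:' "$logfile"
}
```

If the output contains "Check DONOR log", as it does here, the joiner has nothing more to say and the investigation moves to the donor node.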

SSH to another node, 1.142, and check the innobackup log in the MySQL data directory:
[root@node142 ~]# more /var/lib/mysql/innobackup.backup.log
171229 09:11:47 version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock' as 'sstuser' (using password: YES).
Failed to connect to MySQL server: DBI connect(';mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock','sstuser',...) failed: Access denied for user 'sstuser'@'localhost' (using password: YES) at - line 1314. -- Access is denied. Odd: the account had definitely been created, so it may have been dropped by mistake or had its password changed.
171229 09:11:47 Connecting to MySQL server host: localhost, user: sstuser, password: set, port: not set, socket: /var/lib/mysql/mysql.sock
Failed to connect to MySQL server: Access denied for user 'sstuser'@'localhost' (using password: YES).
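
The same check can be scripted on the donor. This sketch lists the accounts that the backup log reports as denied. `donor_denied_accounts` is a hypothetical helper name, and the match string is copied from the log excerpt above.

```shell
#!/usr/bin/env bash
# List the distinct accounts for which the donor's xtrabackup log reports
# an authentication failure. Prints nothing if there are none.
# Usage: donor_denied_accounts /var/lib/mysql/innobackup.backup.log
donor_denied_accounts() {
    local logfile="$1"
    # -o keeps only the matched part, so repeated failures collapse under sort -u.
    grep -o "Access denied for user '[^']*'@'[^']*'" "$logfile" | sort -u
}
```

Empty output means the SST failed for some other reason, and the rest of innobackup.backup.log needs to be read in full.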

II. Resolution

The errors above show that the user used for SST is being denied access. Testing a login as the SST user confirms that it cannot log in:
[root@node142 mysql]# mysql -usstuser -ps3cretPass
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1045 (28000): Access denied for user 'sstuser'@'localhost' (using password: YES)
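
With the xtrabackup-v2 SST method, the donor logs in using the credentials from the `wsrep_sst_auth` option, written as `user:password`. The article does not show the configuration file, so this sketch assumes the option is set in /etc/my.cnf (the defaults file named in the joiner log) and that the value is unquoted or double-quoted. Both function names are hypothetical.

```shell
#!/usr/bin/env bash
# Print the value of wsrep_sst_auth (user:password) from a my.cnf-style file.
# If the option appears more than once, the last occurrence is printed.
sst_auth_from_cnf() {
    local cnf="$1"
    # Accepts wsrep_sst_auth or wsrep-sst-auth, with optional spaces and double quotes.
    sed -n -E 's/^[[:space:]]*wsrep[_-]sst[_-]auth[[:space:]]*=[[:space:]]*"?([^"]*)"?[[:space:]]*$/\1/p' "$cnf" | tail -n 1
}

# Try to log in with exactly the credentials that SST would use.
# Usage on the donor: sst_login_check /etc/my.cnf
sst_login_check() {
    local auth
    auth=$(sst_auth_from_cnf "$1")
    # ${auth%%:*} is the part before the first colon, ${auth#*:} the part after it.
    mysql -u"${auth%%:*}" -p"${auth#*:}" -e 'SELECT CURRENT_USER();'
}
```

Testing with the configured value instead of a password typed from memory shows whether the mismatch lies in the configuration file or in the account itself.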

-- reset the password for this account
[root@node142 mysql]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 26
Server version: 5.7.19-17-57 Percona XtraDB Cluster (GPL), Release rel17, Revision 35cdc81, WSREP version 29.22, wsrep_29.22

Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> alter user 'sstuser'@'localhost' identified by 's3cretPass';  -- reset the account's password
Query OK, 0 rows affected (0.05 sec)

mysql> exit
Bye
[root@node142 mysql]# mysql -usstuser -ps3cretPass   # the login now succeeds
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 27
Server version: 5.7.19-17-57 Percona XtraDB Cluster (GPL), Release rel17, Revision 35cdc81, WSREP version 29.22, wsrep_29.22

Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> exit
Bye
After that, restarting the node at 1.146 succeeded.
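
Resetting the password was enough here because the account still existed. If it had been dropped, it would also need its privileges back. The sketch below generates the statements for both cases. It assumes the privilege set that the Percona XtraDB Cluster 5.7 installation guide gives for the SST user (RELOAD, LOCK TABLES, PROCESS, REPLICATION CLIENT) and a password with no single quote in it. `sst_user_sql` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Emit SQL that creates the SST account if it is missing, resets its password,
# restores the privileges assumed above, and shows the resulting grants.
sst_user_sql() {
    local user="$1" pass="$2"
    cat <<SQL
CREATE USER IF NOT EXISTS '${user}'@'localhost' IDENTIFIED BY '${pass}';
ALTER USER '${user}'@'localhost' IDENTIFIED BY '${pass}';
GRANT RELOAD, LOCK TABLES, PROCESS, REPLICATION CLIENT ON *.* TO '${user}'@'localhost';
SHOW GRANTS FOR '${user}'@'localhost';
SQL
}

# Example, run on the donor node:
#   sst_user_sql sstuser 's3cretPass' | mysql -uroot -p
```

Printing the statements first makes it possible to review them before piping them into the server.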

