PXC 5.7 WSREP_SST: [ERROR] xtrabackup_checkpoints missing

Summary:

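A while ago, after setting up PXC 5.7, I hit "[ERROR] xtrabackup_checkpoints missing. xtrabackup/SST failed on DONOR" while starting one of the nodes. The joiner's own log does not explain this error; the detailed messages have to be read from another node, the donor. The problem and its fix are described below for reference.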

1. Symptoms

One node of a PXC MySQL 5.7 cluster failed to restart. Its state after the failed restart is shown below.
Current environment
[root@node146 ~]# more /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)
[root@node146 ~]# mysql -V
mysql Ver 14.14 Distrib 5.7.19-17, for Linux (x86_64) using 6.2

[root@node146 ~]# systemctl status mysql.service
mysql.service - Percona XtraDB Cluster
Loaded: loaded (/usr/lib/systemd/system/mysql.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Fri 2017-12-29 09:12:13 CST; 44s ago
Process: 19192 ExecStopPost=/usr/bin/mysql-systemd stop-post (code=exited, status=0/SUCCESS)
Process: 19160 ExecStop=/usr/bin/mysql-systemd stop (code=exited, status=2)
Process: 18153 ExecStartPost=/usr/bin/mysql-systemd start-post $MAINPID (code=exited, status=1/FAILURE)
Process: 18151 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=1/FAILURE)
Process: 18032 ExecStartPre=/usr/bin/mysql-systemd start-pre (code=exited, status=0/SUCCESS)
Main PID: 18151 (code=exited, status=1/FAILURE)
-- error messages start here
Dec 29 09:12:13 node146.example.com mysql-systemd[18153]: ERROR! mysqld_safe with PID 18151 has already exited: FAILURE
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service: control process exited, code=exited status=1
Dec 29 09:12:13 node146.example.com mysql-systemd[19160]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 29 09:12:13 node146.example.com mysql-systemd[19160]: ERROR! mysql already dead
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service: control process exited, code=exited status=2
Dec 29 09:12:13 node146.example.com mysql-systemd[19192]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 29 09:12:13 node146.example.com mysql-systemd[19192]: WARNING: mysql may be already dead
Dec 29 09:12:13 node146.example.com systemd[1]: Failed to start Percona XtraDB Cluster.
Dec 29 09:12:13 node146.example.com systemd[1]: Unit mysql.service entered failed state.
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service failed.

Dec 29 09:11:29 node146.example.com polkitd[18024]: Started polkitd version 0.112
Dec 29 09:11:29 node146.example.com polkitd[18024]: Loading rules from directory /etc/polkit-1/rules.d
Dec 29 09:11:29 node146.example.com polkitd[18024]: Loading rules from directory /usr/share/polkit-1/rules.d
Dec 29 09:11:29 node146.example.com polkitd[18024]: Finished loading, compiling and executing 6 rules
Dec 29 09:11:29 node146.example.com dbus[889]: [system] Successfully activated service 'org.freedesktop.PolicyKit1'
Dec 29 09:11:29 node146.example.com dbus-daemon[889]: dbus[889]: [system] Successfully activated service 'org.freedesktop.PolicyKit1'
Dec 29 09:11:29 node146.example.com systemd[1]: Started Authorization Manager.
-- Subject: Unit polkit.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit polkit.service has finished starting up.
--
-- The start-up result is done.
Dec 29 09:11:29 node146.example.com polkitd[18024]: Acquired the name org.freedesktop.PolicyKit1 on the system bus
Dec 29 09:11:29 node146.example.com gnome-session[2869]: PolicyKit daemon reconnected to bus.
Dec 29 09:11:29 node146.example.com gnome-session[2869]: Attempting to re-register as an authentication agent.
Dec 29 09:11:29 node146.example.com polkitd[18024]: Registered Authentication Agent for unix-session:c1 (system bus name :1.30 [gnome-shell --mode=gdm], object path /org/freedesktop/PolicyKit1/Authentication
Dec 29 09:11:29 node146.example.com gnome-session[2869]: We are now a registered authentication agent.
Dec 29 09:11:29 node146.example.com polkitd[18024]: Registered Authentication Agent for unix-process:18020:4562323 (system bus name :1.288 [/usr/bin/pkttyagent --notify-fd 5 --fallback], object path /org/fre
Dec 29 09:11:29 node146.example.com systemd[1]: Starting Percona XtraDB Cluster...
-- Subject: Unit mysql.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mysql.service has begun starting up.
Dec 29 09:11:40 node146.example.com mysqld_safe[18151]: 2017-12-29T01:11:40.538125Z mysqld_safe Logging to '/var/log/mysqld.log'.
Dec 29 09:11:40 node146.example.com mysqld_safe[18151]: 2017-12-29T01:11:40.540867Z mysqld_safe Logging to '/var/log/mysqld.log'.
Dec 29 09:11:40 node146.example.com mysqld_safe[18151]: 2017-12-29T01:11:40.565416Z mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
Dec 29 09:11:40 node146.example.com mysqld_safe[18151]: 2017-12-29T01:11:40.569662Z mysqld_safe Skipping wsrep-recover for empty datadir: /var/lib/mysql
Dec 29 09:11:40 node146.example.com mysqld_safe[18151]: 2017-12-29T01:11:40.570611Z mysqld_safe Assigning 00000000-0000-0000-0000-000000000000:-1 to wsrep_start_position
Dec 29 09:11:43 node146.example.com mysql-systemd[18153]: State transfer in progress, setting sleep higher
Dec 29 09:12:04 node146.example.com systemd[1]: mysql.service: main process exited, code=exited, status=1/FAILURE
Dec 29 09:12:13 node146.example.com mysql-systemd[18153]: /usr/bin/mysql-systemd: line 140: kill: (18151) - No such process
Dec 29 09:12:13 node146.example.com mysql-systemd[18153]: ERROR! mysqld_safe with PID 18151 has already exited: FAILURE
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service: control process exited, code=exited status=1
Dec 29 09:12:13 node146.example.com mysql-systemd[19160]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 29 09:12:13 node146.example.com mysql-systemd[19160]: ERROR! mysql already dead
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service: control process exited, code=exited status=2
Dec 29 09:12:13 node146.example.com mysql-systemd[19192]: WARNING: mysql pid file /var/run/mysqld/mysqld.pid empty or not readable
Dec 29 09:12:13 node146.example.com mysql-systemd[19192]: WARNING: mysql may be already dead
Dec 29 09:12:13 node146.example.com systemd[1]: Failed to start Percona XtraDB Cluster.
-- Subject: Unit mysql.service has failed ## Author : Leshami
-- Defined-By: systemd                           ## Blog : http://blog.csdn.net/leshami
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit mysql.service has failed.
--
-- The result is failed.
Dec 29 09:12:13 node146.example.com systemd[1]: Unit mysql.service entered failed state.
Dec 29 09:12:13 node146.example.com systemd[1]: mysql.service failed.
Dec 29 09:12:13 node146.example.com polkitd[18024]: Unregistered Authentication Agent for unix-process:18020:4562323 (system bus name :1.288, object path /org/freedesktop/PolicyKit1/AuthenticationAgent, loca
lines 2162-2213/2213 (END)

Next, check the mysqld error log
[root@node146 ~]# tail -fn 100 /var/log/mysqld.log
2017-12-29T01:11:43.416689Z 2 [Note] WSREP: GCache history reset: 00000000-0000-0000-0000-000000000000:0 -> aeb87793-ebb2-11e7-b33e-eeaf4988bbe4:0
2017-12-29T01:11:43.812807Z WSREP_SST: [INFO] WARNING: Stale temporary SST directory: /var/lib/mysql//.sst from previous state transfer. Removing
2017-12-29T01:11:43.841926Z WSREP_SST: [INFO] Proceeding with SST.........
2017-12-29T01:11:43.867695Z WSREP_SST: [INFO] ............Waiting for SST streaming to complete!
2017-12-29T01:11:44.359210Z 0 [Note] WSREP: (3a4aef7c, 'tcp://0.0.0.0:4567') connection to peer 3a4aef7c with addr tcp://192.168.81.146:4567 timed out, no messages seen in PT3S
2017-12-29T01:11:44.359666Z 0 [Note] WSREP: (3a4aef7c, 'tcp://0.0.0.0:4567') turning message relay requesting off
2017-12-29T01:11:53.926533Z WSREP_SST: [ERROR] ******************* FATAL ERROR **********************  -- a fatal error related to xtrabackup; the next line says to check the DONOR log
2017-12-29T01:11:53.927524Z WSREP_SST: [ERROR] xtrabackup_checkpoints missing. xtrabackup/SST failed on DONOR. Check DONOR log
2017-12-29T01:11:53.928649Z WSREP_SST: [ERROR] ******************************************************
2017-12-29T01:11:53.932079Z WSREP_SST: [ERROR] Cleanup after exit with status:2
2017-12-29T01:11:53.942644Z 0 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '192.168.81.146' --datadir '/var/lib/mysql/' --defaults-file '/etc/my.cnf' --defaults-group-suffix '' --parent '18614' '' : 2 (No such file or directory)
2017-12-29T01:11:53.942695Z 0 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2017-12-29T01:11:53.942703Z 0 [ERROR] WSREP: SST script aborted with error 2 (No such file or directory)
2017-12-29T01:11:53.942729Z 0 [ERROR] WSREP: SST failed: 2 (No such file or directory)
2017-12-29T01:11:53.942735Z 0 [ERROR] Aborting
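
The joiner's log mixes Galera notes with the SST script's own messages. As a rough sketch, the script's error lines can be pulled out with a fixed-string grep. The function name `sst_errors` is made up for illustration, and the log path is the one shown in this post.

```shell
#!/usr/bin/env bash
# Print only the SST script's error lines from a mysqld error log.
# sst_errors is a hypothetical helper name, not part of PXC.
# Usage: sst_errors /var/log/mysqld.log
sst_errors() {
    # -F treats the pattern as a fixed string, so [ERROR] is not a bracket expression
    grep -F 'WSREP_SST: [ERROR]' "$1"
}

# Run against the log path used in this post, if it is readable on this host.
if [ -r /var/log/mysqld.log ]; then
    sst_errors /var/log/mysqld.log || true
fi
```

On the joiner this narrows the log down to the "Check DONOR log" hint, which is the cue to move to the other node.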

SSH to another node, 1.142 (node142), and look at the innobackup log in the MySQL data directory
[root@node142 ~]# more /var/lib/mysql/innobackup.backup.log
171229 09:11:47 version_check Connecting to MySQL server with DSN 'dbi:mysql:;mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock' as 'sstuser' (using password: YES).
Failed to connect to MySQL server: DBI connect(';mysql_read_default_group=xtrabackup;mysql_socket=/var/lib/mysql/mysql.sock','sstuser',...) failed: Access denied for user 'sstuser'@'localhost' (using password: YES) at - line 1314. -- access is denied, which is odd because the account had definitely been created; it may have been dropped by mistake or modified.
171229 09:11:47 Connecting to MySQL server host: localhost, user: sstuser, password: set, port: not set, socket: /var/lib/mysql/mysql.sock
Failed to connect to MySQL server: Access denied for user 'sstuser'@'localhost' (using password: YES).
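
The same check can be scripted on the donor. Below is a minimal sketch that reports whether the donor-side log contains an authentication failure. The helper name `donor_auth_failed` is hypothetical, and the log path is the one used in this post.

```shell
#!/usr/bin/env bash
# Succeed (exit 0) when the given innobackup log shows a rejected login.
# donor_auth_failed is a hypothetical helper name, not part of PXC.
donor_auth_failed() {
    grep -q 'Access denied for user' "$1"
}

# Path as shown in this post; adjust if the datadir differs.
donor_log=/var/lib/mysql/innobackup.backup.log
if [ -r "$donor_log" ] && donor_auth_failed "$donor_log"; then
    echo "SST user was rejected on the donor; check its account and password"
fi
```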

2. Resolution

The errors above show that the user used for SST is being denied access. A login test with the SST user confirms that it cannot log in:
[root@node142 mysql]# mysql -usstuser -ps3cretPass
mysql: [Warning] Using a password on the command line interface can be insecure.
ERROR 1045 (28000): Access denied for user 'sstuser'@'localhost' (using password: YES)
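
In PXC 5.7 the xtrabackup-v2 SST script logs in with the credentials given by `wsrep_sst_auth` (in `user:password` form), so a mismatch between that setting and the account stored in MySQL produces this kind of access-denied error. A minimal sketch for reading the configured value follows. The helper name `sst_auth_from_cnf` is hypothetical, and it assumes the option is set directly in `/etc/my.cnf` (the defaults file named in the joiner log) rather than in an included file.

```shell
#!/usr/bin/env bash
# Print the wsrep_sst_auth value (user:password) from a my.cnf-style file.
# sst_auth_from_cnf is a hypothetical helper name. It accepts both the
# underscore and dash spellings of the option and strips surrounding quotes
# (so it would also strip quote characters inside a password).
sst_auth_from_cnf() {
    sed -n 's/^[[:space:]]*wsrep[_-]sst[_-]auth[[:space:]]*=[[:space:]]*//p' "$1" \
        | tail -n 1 \
        | tr -d "\"'"
}

# Assumed location of the setting; only the user part is printed.
cnf=/etc/my.cnf
if [ -r "$cnf" ]; then
    auth=$(sst_auth_from_cnf "$cnf")
    echo "SST user configured in $cnf: ${auth%%:*}"
fi
```

If the user and password found here differ from what the account accepts, either the configuration or the account needs to be brought back in line.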

-- fix the account below (here, by resetting its password)
[root@node142 mysql]# mysql -uroot -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 26
Server version: 5.7.19-17-57 Percona XtraDB Cluster (GPL), Release rel17, Revision 35cdc81, WSREP version 29.22, wsrep_29.22

Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> alter user 'sstuser'@'localhost' identified by 's3cretPass';  -- change the account's password
Query OK, 0 rows affected (0.05 sec)

mysql> exit
Bye
[root@node142 mysql]# mysql -usstuser -ps3cretPass -- login now succeeds
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 27
Server version: 5.7.19-17-57 Percona XtraDB Cluster (GPL), Release rel17, Revision 35cdc81, WSREP version 29.22, wsrep_29.22

Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> exit
Bye
After that, restarting the node at IP 1.146 succeeded.
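
To confirm that a restarted node has actually rejoined the cluster, the `wsrep_local_state_comment` status variable should read `Synced` once SST has finished. Below is a small sketch, with a hypothetical helper name `is_synced`, that checks the output of the status query. It assumes the `mysql` client can log in without prompting (for example through credentials in `~/.my.cnf`).

```shell
#!/usr/bin/env bash
# Read "name<TAB>value" status rows on stdin and succeed only when
# wsrep_local_state_comment is Synced.
# is_synced is a hypothetical helper name, not part of PXC.
is_synced() {
    awk '$1 == "wsrep_local_state_comment" { state = $2 }
         END { exit (state != "Synced") }'
}

# Intended use on the restarted node (left as a comment because it needs
# a running server and working client credentials):
#   mysql -N -B -e "SHOW STATUS LIKE 'wsrep_local_state_comment'" | is_synced \
#       && echo "node is Synced"
```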

3. Further reading

Installing Percona XtraDB Cluster (PXC) 5.7 on CentOS 7
Changing the timestamp display in MySQL 5.7 (log_timestamps UTC)
MySQL PXC 5.7 invalid user '@MYSQLD_USER@'
