MySQL安装
问题汇总
Q1:登陆提示Access denied for user ‘root’@‘localhost’ (using password: YES)
- 在配置文件my.cnf/my.ini添加配置skip-grant-tables,跳过权限校验。
- 修改密码UPDATE mysql.user SET authentication_string = “*81F5E21E35407D884A6CD4A731AEBFB6AF209E1B”,password_expired=“N” WHERE User=“root” and Host=“localhost”;
Q2:主备同步的时候,由于Master执行的数据在Master中不存在或者异常通信导致数据不一致,MHA检测主从状态出现以下问题:Could not execute Update_rows event on table oa.bui_bill_sum; Can’t find record in ‘bui_bill_sum’, Error_code: 1032; handler error HA_ERR_KEY_NOT_FOUND; the event’s master log mysql-bin.000138, end_log_pos 160051747
# 在Slave中执行以下SQL,需要登陆到Mysql STOP SLAVE; SET GLOBAL sql_slave_skip_counter =1; #表示跳过一步错误,后面的数字可变 START SLAVE;
环境说明
- Linux CentOS7服务器
- MySQL 5.7 RPM mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar
- 防火墙关闭
CentOS 6 处理方法
//临时关闭 service iptables stop //禁止开机启动 chkconfig iptables off
CentOS 7 处理方法
CentOS 7版本以后防火墙默认使用firewalld
,所以CentOS 7 防火墙命令和CentOS 6 存在差异。
//临时关闭 systemctl stop firewalld //禁止开机启动 systemctl disable firewalld.service Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service. Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
如果安装了iptables-service选择方案
//安装ip服务 yum install -y iptables-services //关闭防火墙 service iptables stop Redirecting to /bin/systemctl stop iptables.service //检查防火墙状态 service iptables status Redirecting to /bin/systemctl status iptables.service iptables.service - IPv4 firewall with iptables Loaded: loaded (/usr/lib/systemd/system/iptables.service; disabled; vendor preset: disabled) Active: inactive (dead)
MySQL RPM安装
检查是否已经安装过MySQL
rpm -qa | grep mysql # 如果没有打印就是没有安装 # 如果安装过了,那么可以卸载系统自带的 rpm -e mysql # 普通删除模式 rpm -e --nodeps mysql # 强力删除模式,如果使用上面的删除提示有其他依赖文件的时候,使用该命令可以强力删除
检查CentOS 7是否默认自带mariadb
# 检查mariadb是否安装 rpm -aq | grep mariadb # 如果安装过,会显示如下 mariadb-libs-5.5.60-1.el7_5.x86_64 # 卸载 --nodeps参数作用是如果有依赖,会强力删除,如果不加可能会提示失败 rpm -e --nodeps mariadb-libs-5.5.60-1.el7_5.x86_64
上传RPM包
# 这里使用rz上传文件,如果没有安装这个命令可以使用以下命令安装,也可以使用Xftp客户端工具上传 yum install lrzsz # 上传文件 rz # 打开文件夹选择窗口,上传文件到用户的home目录
解压RPM
tar -xvf mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar # 解压之后可以看到有多个RPM安装包 -rw-------. 1 root root 1455 10月 2 01:00 anaconda-ks.cfg -rw-r--r--. 1 root root 609556480 9月 16 09:49 mysql-5.7.28-1.el7.x86_64.rpm-bundle.tar -rw-r--r--. 1 7155 31415 45109364 9月 30 2019 mysql-community-client-5.7.28-1.el7.x86_64.rpm -rw-r--r--. 1 7155 31415 318768 9月 30 2019 mysql-community-common-5.7.28-1.el7.x86_64.rpm -rw-r--r--. 1 7155 31415 7037096 9月 30 2019 mysql-community-devel-5.7.28-1.el7.x86_64.rpm -rw-r--r--. 1 7155 31415 49329100 9月 30 2019 mysql-community-embedded-5.7.28-1.el7.x86_64.rpm -rw-r--r--. 1 7155 31415 23354908 9月 30 2019 mysql-community-embedded-compat-5.7.28-1.el7.x86_64.rpm -rw-r--r--. 1 7155 31415 136837816 9月 30 2019 mysql-community-embedded-devel-5.7.28-1.el7.x86_64.rpm -rw-r--r--. 1 7155 31415 4374364 9月 30 2019 mysql-community-libs-5.7.28-1.el7.x86_64.rpm -rw-r--r--. 1 7155 31415 1353312 9月 30 2019 mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm -rw-r--r--. 1 7155 31415 208694824 9月 30 2019 mysql-community-server-5.7.28-1.el7.x86_64.rpm -rw-r--r--. 1 7155 31415 133129992 9月 30 2019 mysql-community-test-5.7.28-1.el7.x86_64.rpm
安装RPM
# 安装命令 rpm -ivh mysql-community-common-5.7.28-1.el7.x86_64.rpm rpm -ivh mysql-community-libs-5.7.28-1.el7.x86_64.rpm rpm -ivh mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm rpm -ivh mysql-community-client-5.7.28-1.el7.x86_64.rpm rpm -ivh mysql-community-server-5.7.28-1.el7.x86_64.rpm rpm -ivh mysql-community-devel-5.7.28-1.el7.x86_64.rpm
MySQL的RPM安装需要注意依赖关系,不是随便安装就可以的。RPM的安装顺序如下:
- mysql-community-common-5.7.28-1.el7.x86_64.rpm
- mysql-community-libs-5.7.28-1.el7.x86_64.rpm
- mysql-community-libs-compat-5.7.28-1.el7.x86_64.rpm (安装完这三个,其他的可以随意顺序安装)
- mysql-community-client-5.7.28-1.el7.x86_64.rpm
- mysql-community-server-5.7.28-1.el7.x86_64.rpm
- mysql-community-devel-5.7.28-1.el7.x86_64.rpm(开发工具)
初始化数据库实例
mysqld --initialize --user=mysql
初始化数据库后,MySQL会创建一个root用户,但是密码是随机生成的,需要查看密码
cat /var/log/mysqld.log # 显示结果如下 PkoI(_uFw3&t就是密码 2020-10-02T16:49:58.823691Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details). 2020-10-02T16:49:59.143270Z 0 [Warning] InnoDB: New log files created, LSN=45790 2020-10-02T16:49:59.207287Z 0 [Warning] InnoDB: Creating foreign key constraint system tables. 2020-10-02T16:49:59.338916Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 4ee64a7b-04cf-11eb-b5d0-000c29fb7e1e. 2020-10-02T16:49:59.342706Z 0 [Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened. 2020-10-02T16:49:59.827830Z 0 [Warning] CA certificate ca.pem is self signed. 2020-10-02T16:50:00.286400Z 1 [Note] A temporary password is generated for root@localhost: PkoI(_uFw3&t
MySQL启动
# 将MySQL作为服务注册到系统,跟随系统启动 systemctl start mysqld.service # 查看状态 systemctl status mysqld.service # 结果如下 mysqld.service - MySQL Server Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled) Active: active (running) since 六 2020-10-03 01:06:22 CST; 15s ago Docs: man:mysqld(8) http://dev.mysql.com/doc/refman/en/using-systemd.html Process: 37965 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=0/SUCCESS) Process: 37948 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS) Main PID: 37969 (mysqld) CGroup: /system.slice/mysqld.service └─37969 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid 10月 03 01:06:21 localhost.localdomain systemd[1]: Starting MySQL Server... 10月 03 01:06:22 localhost.localdomain systemd[1]: Started MySQL Server.
登陆MySQL
mysql -uroot -p # 重新设置root密码 set password=password('root'); # 推出 exit # 重新登陆检查密码是否正确 mysql -uroot -p
MySQL集群之主从复制
Master配置
配置文件为 /etc/my.cnf
# 基础配置只需要log_bin和server-id参数 log_bin=mysql-bin # 开启binlog server-id=1 # 同一个集群要唯一 sync-binlog=1 # 每次执行都与磁盘同步 # 忽略某些数据库不需要同步 binlog-ignore-db=performance_schema binlog-ignore-db=information_schema binlog-ignore-db=sys # 指定某些数据库需要同步 binlog-do-db=lagou
重启Mysql systemctl restart mysqld;
登陆主库授权从库复制操作
# 对从库的用户授权 grant replication slave on *.* to 'root'@'%' identified by 'root'; grant all privileges on *.* to 'root'@'*' identified by 'root'; # 刷新权限 flush privileges; # 查看主库状态: show master status;
Slave配置
# 登陆mysql 查看slave状态,如果开启需要停止 stop slave; show slave status; start slave; stop slave;
配置文件为 /etc/my.cnf
# 从库读取主库binlog,所以从库binlog可以不开启 server-id=2 # 指定主机日志名称/权限等 relay_log=mysql-relay-bin read_only=1
重启Mysql systemctl restart mysqld;
登陆mysql同步初始化命令,mysql环境下
# 从库从那个主库的那个日志文件的那个位置开始同步; # master_log_file同步的日志文件 # master_log_pos同步开始的位置 change master to master_host='192.168.247.120',master_port=3306,master_user='root',master_password='root',master_log_file='mysql-bin.000002',master_log_pos=869;
# 开启slave start slave;
MySQL集群之半同步复制
主库和从库的插件安装
# 登陆数据库后执行,查看是否支持插件的安装 select @@have_dynamic_loading; # 查看已经安装了那些插件,如果没有semi就动态安装 show plugins;
安装semi插件
# master安装 soname是给插件一个名称 install plugin rpl_semi_sync_master soname 'semisync_master.so'; # 查看半同步复制相关的参数,默认没有开启,相关参数需要设置 show variables like '%semi%'; set global rpl_semi_sync_master_enabled=1; # 开启 set global rpl_semi_sync_master_timeout=1000; # 超时时间1秒 # slave安装 soname是给插件一个名称 install plugin rpl_semi_sync_slave soname 'semisync_slave.so'; # 查看半同步复制相关的参数,默认没有开启,相关参数需要设置 show variables like '%semi%'; set global rpl_semi_sync_slave_enabled=1; # 开启 # 重启slave stop slave; start slave;
查看是否使用半同步复制
# 在Master端查看日志 cat /var/log/mysqld.log # 如果看到以下信息: semi-sync等就是使用了半同步
MySQL集群之并行复制
并行复制主要是通过MySQL的配置开启
MySQL 5.6并行复制
暂无
MySQL 5.7 并行复制
# Master的并行复制参数 show variables like '%binlog_group%'; # 组同步提交延迟 set global binlog_group_commit_sync_delay=1000; # 一组最大多少个事务数 set global binlog_group_commit_sync_no_delay_count=100; # Slave并行复制参数 show variables like '%slave%'; # 停止slave stop slave; # 设置参数slave_parallel_type DATABASE是兼容5.6的基于数据库的设置 set global slave-parallel-type='LOGICAL_CLOCK'; # 工作线程数 set global slave-parallel-workers=8; # Slave中继日志 show variables like '%relay_log%'; # 日志使用TABLE效率更高可以提升50-80%,这些参数如果是只读,进入/etc/my.cnf配置 set global relay_log_info_repository='TABLE'; # master_info_repository参数自行选择,设置为TABLE效率更高 set global master_info_repository='TABLE'; set global relay_log_recovery=1; # my.cnf配置文件添加,有的参数set命令设置了有效,但是重启后就被重置了,保守还是使用配置文件比较稳妥 slave-parallel-type=LOGICAL_CLOCK slave-parallel-workers=8 relay_log_info_repository=TABLE master_info_repository=TABLE relay_log_recovery=1 # 重启MySQL systemctl restart mysqld
MHA高可用搭建
服务器IP | 角色 |
192.168.247.120 | MySQL数据库Master |
192.168.247.130 | MySQL数据库Slave01 |
192.168.247.131 | MySQL数据库Slave02 |
192.168.247.140 | MHA Manager管理服务器 |
配置关键程序软连接(非RPM安装需要)
# 源文件路径自己根据需要修改,后面的路径必须保持一致 ln -s /data/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog ln -s /data/mysql/bin/mysql /usr/bin/mysql
配置各节点互相
# 只需要一个节点设置就可以 192.168.247.140 # 删除原来的证书 rm -rf /root/.ssh # 生成密钥 ssh-keygen cd /root/.ssh # 重命名 mv id_rsa.pub authorized_keys # 发送证书到其他节点 scp -r /root/.ssh 192.168.247.120:/root scp -r /root/.ssh 192.168.247.130:/root scp -r /root/.ssh 192.168.247.131:/root # 所有节点验证,不需要输入密码 # MHA: ssh 192.168.247.140 date ssh 192.168.247.120 date ssh 192.168.247.130 date ssh 192.168.247.131 date # Master: ssh 192.168.247.140 date ssh 192.168.247.120 date ssh 192.168.247.130 date ssh 192.168.247.131 date # Slave01: ssh 192.168.247.140 date ssh 192.168.247.120 date ssh 192.168.247.130 date ssh 192.168.247.131 date # Slave02: ssh 192.168.247.140 date ssh 192.168.247.120 date ssh 192.168.247.130 date ssh 192.168.247.131 date
MHA下载与安装
RPM | 安装方 | 下载地址 |
mha4mysql-manager-0.58-0.el7.centos.noarch.rpm | MHA服务器 | https://github.com/yoshinorim/mha4mysql-manager/releases/tag/v0.58 |
mha4mysql-node-0.58-0.el7.centos.noarch.rpm | MySQL服务器 | https://github.com/yoshinorim/mha4mysql-node/releases/tag/v0.58 |
MySQL 从数据库修改配置
# 开启二进制日志,因为如果切换为主服务器就需要这个 log_bin=mysql-bin sync-binlog=1 #每次写入都同步到binlog binlog_format=ROW binlog-ignore-db=performance_schema #忽略不同步 binlog-ignore-db=information_schema binlog-ignore-db=sys
所有节点安装Node软件依赖包
# 安装依赖包 yum install perl-DBD-MySQL -y # 安装node rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm
MHA安装Manager软件
# 安装依赖 yum install -y perl-Config-Tiny epel-release perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes # 安装软件包 rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
创建MHA专用监控和管理用户(主库执行)
# 主库执行是由于自动同步到其他从库 grant all privileges on *.* to mha@'192.168.247.%' identified by 'mha';
MHA Manager配置文件
# 创建配置文件目录 mkdir -p /etc/mha # 创建日志目录 mkdir -p /var/log/mha/app1 # 编辑mha配置文件 vim /etc/mha/app1.cnf [server default] manager_log=/var/log/mha/app1/manager manager_workdir=/var/log/mha/app1 # 主数据库服务器的二进制日志路径 master_binlog_dir=/var/lib/mysql user=mha password=mha # 一秒ping多少次 ping_interval=2 repl_password=root repl_user=root ssh_user=root [server1] hostname=192.168.247.120 candidate_master=1 port=3306 [server2] hostname=192.168.247.130 candidate_master=1 port=3306 [server3] hostname=192.168.247.131 candidate_master=1 port=3306
状态检测
# 互信检查 masterha_check_ssh --conf=/etc/mha/app1.cnf [root@localhost ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf Mon Oct 5 02:36:33 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Mon Oct 5 02:36:33 2020 - [info] Reading application default configuration from /etc/mha/app1.cnf.. Mon Oct 5 02:36:33 2020 - [info] Reading server configuration from /etc/mha/app1.cnf.. Mon Oct 5 02:36:33 2020 - [info] Starting SSH connection tests.. Mon Oct 5 02:36:36 2020 - [debug] Mon Oct 5 02:36:34 2020 - [debug] Connecting via SSH from root@192.168.247.130(192.168.247.130:22) to root@192.168.247.120(192.168.247.120:22).. Mon Oct 5 02:36:35 2020 - [debug] ok. Mon Oct 5 02:36:35 2020 - [debug] Connecting via SSH from root@192.168.247.130(192.168.247.130:22) to root@192.168.247.131(192.168.247.131:22).. Mon Oct 5 02:36:35 2020 - [debug] ok. Mon Oct 5 02:36:36 2020 - [debug] Mon Oct 5 02:36:33 2020 - [debug] Connecting via SSH from root@192.168.247.120(192.168.247.120:22) to root@192.168.247.130(192.168.247.130:22).. Mon Oct 5 02:36:34 2020 - [debug] ok. Mon Oct 5 02:36:34 2020 - [debug] Connecting via SSH from root@192.168.247.120(192.168.247.120:22) to root@192.168.247.131(192.168.247.131:22).. Mon Oct 5 02:36:35 2020 - [debug] ok. Mon Oct 5 02:36:37 2020 - [debug] Mon Oct 5 02:36:34 2020 - [debug] Connecting via SSH from root@192.168.247.131(192.168.247.131:22) to root@192.168.247.120(192.168.247.120:22).. Mon Oct 5 02:36:35 2020 - [debug] ok. Mon Oct 5 02:36:35 2020 - [debug] Connecting via SSH from root@192.168.247.131(192.168.247.131:22) to root@192.168.247.130(192.168.247.130:22).. Mon Oct 5 02:36:36 2020 - [debug] ok. Mon Oct 5 02:36:37 2020 - [info] All SSH connection tests passed successfully.
主从状态检测
# 注意,如果None of slaves can be master. Check failover configuration file or log-bin settings in my.cnf错误,那么就是从库没有开启二进制日志文件导致的。 # 检测主从数据库服务器状态 masterha_check_repl --conf=/etc/mha/app1.cnf # 执行成功日志 [root@localhost ~]# masterha_check_repl --conf=/etc/mha/app1.cnf Mon Oct 5 02:47:13 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Mon Oct 5 02:47:13 2020 - [info] Reading application default configuration from /etc/mha/app1.cnf.. Mon Oct 5 02:47:13 2020 - [info] Reading server configuration from /etc/mha/app1.cnf.. Mon Oct 5 02:47:13 2020 - [info] MHA::MasterMonitor version 0.58. Mon Oct 5 02:47:14 2020 - [info] GTID failover mode = 0 Mon Oct 5 02:47:14 2020 - [info] Dead Servers: Mon Oct 5 02:47:14 2020 - [info] Alive Servers: Mon Oct 5 02:47:14 2020 - [info] 192.168.247.120(192.168.247.120:3306) Mon Oct 5 02:47:14 2020 - [info] 192.168.247.130(192.168.247.130:3306) Mon Oct 5 02:47:14 2020 - [info] 192.168.247.131(192.168.247.131:3306) Mon Oct 5 02:47:14 2020 - [info] Alive Slaves: Mon Oct 5 02:47:14 2020 - [info] 192.168.247.130(192.168.247.130:3306) Version=5.7.28-log (oldest major version between slaves) log-bin:enabled Mon Oct 5 02:47:14 2020 - [info] Replicating from 192.168.247.120(192.168.247.120:3306) Mon Oct 5 02:47:14 2020 - [info] 192.168.247.131(192.168.247.131:3306) Version=5.7.28-log (oldest major version between slaves) log-bin:enabled Mon Oct 5 02:47:14 2020 - [info] Replicating from 192.168.247.120(192.168.247.120:3306) Mon Oct 5 02:47:14 2020 - [info] Current Alive Master: 192.168.247.120(192.168.247.120:3306) Mon Oct 5 02:47:14 2020 - [info] Checking slave configurations.. Mon Oct 5 02:47:14 2020 - [warning] relay_log_purge=0 is not set on slave 192.168.247.130(192.168.247.130:3306). Mon Oct 5 02:47:14 2020 - [warning] relay_log_purge=0 is not set on slave 192.168.247.131(192.168.247.131:3306). Mon Oct 5 02:47:14 2020 - [info] Checking replication filtering settings.. Mon Oct 5 02:47:14 2020 - [info] binlog_do_db= , binlog_ignore_db= information_schema,performance_schema,sys Mon Oct 5 02:47:14 2020 - [info] Replication filtering check ok. Mon Oct 5 02:47:14 2020 - [info] GTID (with auto-pos) is not supported Mon Oct 5 02:47:14 2020 - [info] Starting SSH connection tests.. Mon Oct 5 02:47:17 2020 - [info] All SSH connection tests passed successfully. Mon Oct 5 02:47:17 2020 - [info] Checking MHA Node version.. Mon Oct 5 02:47:18 2020 - [info] Version check ok. Mon Oct 5 02:47:18 2020 - [info] Checking SSH publickey authentication settings on the current master.. Mon Oct 5 02:47:18 2020 - [info] HealthCheck: SSH to 192.168.247.120 is reachable. Mon Oct 5 02:47:19 2020 - [info] Master MHA Node version is 0.58. Mon Oct 5 02:47:19 2020 - [info] Checking recovery script configurations on 192.168.247.120(192.168.247.120:3306).. Mon Oct 5 02:47:19 2020 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=mysql-bin.000001 Mon Oct 5 02:47:19 2020 - [info] Connecting to root@192.168.247.120(192.168.247.120:22).. Creating /var/tmp if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /var/lib/mysql, up to mysql-bin.000001 Mon Oct 5 02:47:19 2020 - [info] Binlog setting check done. Mon Oct 5 02:47:19 2020 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Mon Oct 5 02:47:19 2020 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.247.130 --slave_ip=192.168.247.130 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.28-log --manager_version=0.58 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx Mon Oct 5 02:47:19 2020 - [info] Connecting to root@192.168.247.130(192.168.247.130:22).. Checking slave recovery environment settings.. Opening /var/lib/mysql/relay-log.info ... ok. Relay log found at /var/lib/mysql, up to mysql-relay-bin.000005 Temporary relay log file is /var/lib/mysql/mysql-relay-bin.000005 Checking if super_read_only is defined and turned on.. not present or turned off, ignoring. Testing mysql connection and privileges.. mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Oct 5 02:47:20 2020 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.247.131 --slave_ip=192.168.247.131 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.28-log --manager_version=0.58 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx Mon Oct 5 02:47:20 2020 - [info] Connecting to root@192.168.247.131(192.168.247.131:22).. Checking slave recovery environment settings.. Opening /var/lib/mysql/relay-log.info ... ok. Relay log found at /var/lib/mysql, up to mysql-relay-bin.000005 Temporary relay log file is /var/lib/mysql/mysql-relay-bin.000005 Checking if super_read_only is defined and turned on.. not present or turned off, ignoring. Testing mysql connection and privileges.. mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Oct 5 02:47:20 2020 - [info] Slaves settings check done. Mon Oct 5 02:47:20 2020 - [info] 192.168.247.120(192.168.247.120:3306) (current master) +--192.168.247.130(192.168.247.130:3306) +--192.168.247.131(192.168.247.131:3306) Mon Oct 5 02:47:20 2020 - [info] Checking replication health on 192.168.247.130.. Mon Oct 5 02:47:20 2020 - [info] ok. Mon Oct 5 02:47:20 2020 - [info] Checking replication health on 192.168.247.131.. Mon Oct 5 02:47:20 2020 - [info] ok. Mon Oct 5 02:47:20 2020 - [warning] master_ip_failover_script is not defined. Mon Oct 5 02:47:20 2020 - [warning] shutdown_script is not defined. Mon Oct 5 02:47:20 2020 - [info] Got exit code 0 (Not master dead). MySQL Replication Health is OK.
开启MHA
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 &
查看MHA状态
# 执行命令 masterha_check_status --conf=/etc/mha/app1.cnf app1 (pid:37744) is running(0:PING_OK), master:192.168.247.120 # 查看监控的机器具体信息 mysql -umha -pmha -h 192.168.247.120 -e "show variables like 'server_id'"
故障模拟
# Msater主服务器杀掉mysql pkill mysqld; # Slave执行查看从库状态是否已经没了,其中一个从库应该是变更为主库,可以看Master_Host显示的主机IP show slave status \G; # MHA服务器查看状态,应该是不执行了 masterha_check_status --conf=/etc/mha/app1.cnf # 结果 主机挂了,变为一主一从,不能满足MHA一主两从要求,所以MHA不再管理APP1集群 [root@localhost ~]# masterha_check_status --conf=/etc/mha/app1.cnf app1 is stopped(2:NOT_RUNNING). [1]+ 完成 nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 # 查看/etc/mha/app1.cnf配置是否少了一个配置 cat /etc/mha/app1.cnf # 结果,少了192.168.247.120节点信息 [root@localhost ~]# cat /etc/mha/app1.cnf [server default] manager_log=/var/log/mha/app1/manager manager_workdir=/var/log/mha/app1 master_binlog_dir=/var/lib/mysql password=mha ping_interval=2 repl_password=root repl_user=root ssh_user=root user=mha [server2] candidate_master=1 hostname=192.168.247.130 port=3306 [server3] candidate_master=1 hostname=192.168.247.131 port=3306 # 查看日志 vim /var/log/mha/app1/manager
故障修复
- 启动之前的主库 systemctl restart mysqld;
- 修复主从 将重新启动的数据库弄到新的主库中作为从库,查看MHA日志cat /var/log/mha/app1/manager可以找到以下信息
Mon Oct 5 16:21:26 2020 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='192.168.247.130', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=154, MASTER_USER='root', MASTER_PASSWORD='xxx'; # 提取信息在重启的数据库中(192.168.247.120)执行 CHANGE MASTER TO MASTER_HOST='192.168.247.130', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000002', MASTER_LOG_POS=154, MASTER_USER='root', MASTER_PASSWORD='root'; # 启动slave start slave; # 作为从库后,执行查看从库结果 show slave status \G; # 发现主库变为了192.168.247.130 这样就加入了集群
- 修改MHA配置,添加192.168.247.120服务器信息 vim /etc/mha/app1.cnf
[server1] candidate_master=1 hostname=192.168.247.120 port=3306
- 重启MHA
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null> /var/log/mha/app1/manager.log 2>&1 & # 查看MHA状态 masterha_check_status --conf=/etc/mha/app1.cnf # 暂停MHA masterha_stop --conf=/etc/mha/app1.cnf
- 手动切换主备
如果MHA开启,那么默认情况下主备可以自动切换,当主机故障的时候,MHA将自动将备机切换为Master。
需要手动切换主备的情况
1)MHA故障,在这个时候Master同时也故障,就需要在Slave中选择一台作为Master。
2)Master和Slave正常运行,这时候需要添加一台服务器,并且将这台服务器作为Master,替代之前的Master机器。
# 情况一的操作手法: # 暂停MHA masterha_stop --conf=/etc/mha/app1.cnf # 暂停Master,这里是模拟Master故障停机了 # Msater主服务器杀掉mysql pkill mysqld; # 手工failover场景,master死掉,但是masterha_manager没有开启,可以通过手工failover: masterha_master_switch --conf=/etc/mha/app1.cnf --dead_master_host=192.168.247.130 --master_state=dead --new_master_host=192.168.247.120 --ignore_last_failover # 情况二的操作手法: # 手工在线切换,手动在线切换mha,切换时需要将在运行的mha停掉后才能切换。 # 方式一 masterha_master_switch --conf=/etc/mha/app1.cnf --master_state=alive --new_master_host=192.168.247.120 --orig_master_is_new_slave # 方式二 # --orig_master_is_new_slave切换时加上此参数是将原master变为slave节点,如果不加此参数,原来的master将不启动 #--running_updates_limit=10000 切换时候选master如果有延迟的话,mha切换不能成功,加上此参数表示延迟在此时间范围内都可切换(单位为s),但是切换的时间长短是由recover时relay日志的大小决定 masterha_master_switch --conf=/etc/mha/app1.cnf --master_state=alive --new_master_host=192.168.247.120 --orig_master_is_new_slave --running_updates_limit=10000