August 12, 8:40 PM:
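The MySQL instance on port 3306 of server 1.30 went down.
Root cause: disk sdb failed (diagnosed with fdisk -l). The MySQL data directory was mounted on that disk, so the 3306 instance on 1.30 crashed.
System-level fix: rebooted the operating system and reseated the sdb disk.
Database-level fix: restarted the 3306 and 3307 instances on 1.30.
Checked replication between 3307 and 3306 on 1.30 (using show master status and show slave status).
Then checked replication for 3306 and 3307 on 1.31 and found it broken (Got fatal error 1236 from master when reading data from binary log).
Slave status: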
mysql> show slave status\G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 192.168.1.31
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: forummysql01-bin.004799
Read_Master_Log_Pos: 73959657
Relay_Log_File: mysql-relay-bin.001687
Relay_Log_Pos: 73959809
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position'
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.00 sec)
The slave was requesting a binlog position that the master could not serve.
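When the full status output is long, it can help to pull out only the fields that matter for error 1236. The following is a minimal sketch, not part of the original procedure: the function name slave_status_summary is hypothetical, and it only filters text that the mysql client has already printed.

```shell
#!/bin/sh
# Hypothetical helper: reads the text of `SHOW SLAVE STATUS\G` on stdin
# and prints the fields relevant to a binlog position error as key=value.
slave_status_summary() {
    awk '
        match($0, /^ *(Master_Log_File|Read_Master_Log_Pos|Slave_IO_Running|Slave_SQL_Running|Last_IO_Error|Last_Error):/) {
            # Key is the matched text without the trailing colon.
            key = substr($0, RSTART, RLENGTH - 1)
            sub(/^ +/, "", key)
            # Value is everything after the colon, leading spaces removed.
            val = substr($0, RSTART + RLENGTH)
            sub(/^ +/, "", val)
            print key "=" val
        }
    '
}
```

Assumed usage on the slave: mysql -e 'show slave status\G' | slave_status_summary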
Next, went back to the master and checked its binlog status:
mysql> show master status\G
*************************** 1. row ***************************
File: forummysql01-bin.004800
Position: 29732172
Binlog_Do_DB:
Binlog_Ignore_DB:
1 row in set (0.00 sec)
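Dumped and inspected binlog forummysql01-bin.004799 (mysqlbinlog ./forummysql01-bin.004799 > /root/bin_004799.log).
The positions in the binlog did not line up with what the slave expected: forummysql01-bin.004799 contains no event at position 73959657.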
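One way to look for a usable restart position in the dump is to list the event offsets that mysqlbinlog prints as "# at N" lines and take the last one that does not exceed the position the slave asked for. This is a sketch under that assumption; the function name last_event_at_or_before is hypothetical, and it is not necessarily how the position used in this incident was chosen.

```shell
#!/bin/sh
# Hypothetical helper: print the largest "# at N" offset in a mysqlbinlog
# text dump that is less than or equal to the requested position.
# Exits non-zero if no such offset exists.
last_event_at_or_before() {
    target=$1
    dump=$2
    awk -v target="$target" '
        /^# at [0-9]+$/ {
            pos = $3 + 0
            if (pos <= target + 0 && pos > best) best = pos
        }
        END { if (best > 0) print best; else exit 1 }
    ' "$dump"
}
```

Assumed usage: last_event_at_or_before 73959657 /root/bin_004799.log. Restarting from an earlier offset replays events the slave may already have applied, so duplicate-key errors afterwards are a plausible side effect.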
Back on the slaves, repointed both slaves to a valid binlog position at the same time:
stop slave;
CHANGE MASTER TO MASTER_HOST = '192.168.1.31',MASTER_PORT = 3306,MASTER_LOG_FILE = 'forummysql01-bin.004799',MASTER_LOG_POS = 73880572;
start slave;
Checked the slave status again with show slave status\G
A different error had appeared:
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.1.31
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: forummysql01-bin.004800
Read_Master_Log_Pos: 32884198
Relay_Log_File: mysql-relay-bin.000002
Relay_Log_Pos: 16724
Relay_Master_Log_File: forummysql01-bin.004800
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_Error: Error 'Duplicate entry '663961' for key 'PRIMARY'' on query. Default database: 'kaiyuan'. Query: 'insert into wiki_download(download_name,download_ip,download_refer,download_time) values ('HDWiki-v5.1GBK-20120509.zip','192.168.1.41','','2012-08-12 21:36:25')'
Skipped the failing event:
stop slave;
set global sql_slave_skip_counter = 1;
start slave;
After that, replication ran normally.
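To confirm that replication really is running after the skip, both Slave_IO_Running and Slave_SQL_Running should read Yes. A small sketch of that check follows; the function name replication_running is hypothetical, and it only inspects status text passed on stdin.

```shell
#!/bin/sh
# Hypothetical helper: reads the text of `SHOW SLAVE STATUS\G` on stdin
# and succeeds only if both replication threads report "Yes".
replication_running() {
    awk '
        /^ *Slave_IO_Running: Yes *$/  { io = 1 }
        /^ *Slave_SQL_Running: Yes *$/ { sql = 1 }
        END { exit !(io && sql) }
    '
}
```

Assumed usage: mysql -e 'show slave status\G' | replication_running && echo "replication OK"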