环境:双M-M架构,其中一台B因为磁盘损坏,服务器异常重启。重启之后B上面的数据库正常运行,当时A 库报如下错误:
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position
root@rac1 # my 3306
Entry Port ==== 3306
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1612406693
Server version: 5.1.40-community-log MySQL Community Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
root@127.0.0.1 : (none) 20:55:50> show slave status \G
*************************** 1. row ***************************
Slave_IO_State:
Master_Host: 10.250.7.3
Master_User: replicator
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000143
Read_Master_Log_Pos: 664526789
Relay_Log_File: slave-relay.000339
Relay_Log_Pos: 251
Relay_Master_Log_File: mysql-bin.000143
Slave_IO_Running: No
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 664526789
Relay_Log_Space: 445
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 1236
Last_IO_Error: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position'
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.00 sec)
root@127.0.0.1 : (none) 20:55:52> exit
Bye
查看A库的error log日志 发现
root@rac1 # tail -f /home/mysql/data/mysql/master-error.log
120611 19:32:35 [Warning] Aborted connection 1341365540 to db: 'unconnected' user: 'replicator' host: '10.250.7.3' (Got timeout writing communication packets)
120611 20:31:28 [ERROR] Error reading packet from server: Client requested master to start replication from impossible position ( server_errno=1236)
120611 20:31:28 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position', Error_code: 1236
120611 20:31:28 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000143', position 664526789
120611 20:39:27 [Note] Error reading relay log event: slave SQL thread was killed
120611 20:39:38 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000143' at position 664526789, relay log '/home/mysql/data/mysql/slave-relay.000339' position: 251
120611 20:39:38 [Note] Slave I/O thread: connected to master 'replicator@10.250.7.3:3306',replication started in log 'mysql-bin.000143' at position 664526789
120611 20:39:38 [ERROR] Error reading packet from server: Client requested master to start replication from impossible position ( server_errno=1236) 上面的位置错误
120611 20:39:38 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Client requested master to start replication from impossible position', Error_code: 1236
120611 20:39:38 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000143', position 664526789
Slave_IO_Running线程终止。仔细看上面的报错信息,说slave进程试图从mysql-bin.000143日志的 position 664526789开始启动恢复,但是该日志中是没有此position。
根据主库A 上的错误信息
120611 20:39:38 [Note] Slave I/O thread: connected to master 'replicator@10.250.7.3:3306',replication started in log 'mysql-bin.000143' at position 664526789
到从库B 上查看其bin log记录
root@rac2 # mysqlbinlog mysql-bin.000143 > log_20120611.sql
root@rac2 #
root@rac2 #
root@rac2 #
root@rac2 # tail -f log_20120611.sql
SET TIMESTAMP=1339414288/*!*/;
insert into vm_monitor_20120611(time_stamp,name,group_id,user_id,cpu,memory,rx,tx,flow_volume,bandwidth, read_iops, write_iops, gmt_create) values(1339414287,'VM-3157068F','977','21',10.0304,12288,14533,31178,45711,1524,221.4,9.66667,'2012-06-11,19:31:28')
/*!*/;
# at 664521543
#120611 19:31:28 server id 2 end_log_pos 664521570 Xid = 546104615
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
回到主库A 上面进行重新指定应用日志的位置,查看master.info信息,从这里也可以看出 io thread 进行恢复的起始位置为不合理的664526789
root@rac1 #
root@rac1 # more master.info
15
mysql-bin.000143
664526789
10.250.7.3
replicator
xxxxxxxx
3306
60
0
0
root@rac1 # my 3306
Entry Port ==== 3306
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1612460302
Server version: 5.1.40-community-log MySQL Community Server (GPL)
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
root@127.0.0.1 : (none) 21:11:51> stop slave;
Query OK, 0 rows affected (0.00 sec)
这里可以指定mysql-bin.000144 的第一个post 也可以指定 mysql-bin.000143的664521543位置!
root@127.0.0.1 : (none) 21:12:27> CHANGE MASTER TO MASTER_HOST='10.250.7.3',
-> MASTER_USER ='replicator',
-> MASTER_PASSWORD ='xxxxx',
-> MASTER_LOG_FILE ='mysql-bin.000144',
-> MASTER_LOG_POS =4,
-> master_port =3306;
Query OK, 0 rows affected (0.16 sec)
root@127.0.0.1 : (none) 21:13:32> start slave;
Query OK, 0 rows affected (0.00 sec)
root@127.0.0.1 : (none) 21:13:38>
root@127.0.0.1 : (none) 21:13:38> show slave status \G
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 10.250.7.3
Master_User: replicator
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000144
Read_Master_Log_Pos: 148515312
Relay_Log_File: slave-relay.000002
Relay_Log_Pos: 251
Relay_Master_Log_File: mysql-bin.000144
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 148515312
Relay_Log_Space: 402
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.00 sec)
至此问题解决 ok!!