如果节点的网络短暂(如10秒 - 1分钟)与源数据库的机器连不上,就会重试3次连接网络,然后Kill掉连接。接着网络连接上了,但报错:Could not find first log file name in binary log index file,报3次这样的错后,又Kill掉连接,如此kill掉3次后,最后不停地报:NullPointerException。最后只用停止Channel,删除最后同步位置信息,然后设置canal的“位点自定义设置”,最后启动 Channel 才恢复正常。这种异常非常常见,隔一段时间就发生。使用命令:show binlog events in 'bin-log.00000x' from xxx; 查日志又是存在的。
背景:1个数据源,6个异构的Node节点分布在不同的内网机器,同步3张表,每张表数量在1W左右。上面的异常有2个节点发生了,报错的日志几乎是一模一样的,其它4个正常。
开始的3次网络连接不上日志:
2016-09-05 08:12:45.180 [destination = canal-xmg , address = /172.30.248.148:3306 , EventParser] ERROR c.a.o.canal.parse.inbound.mysql.dbsync.DirectLogFetcher - I/O error while reading from client socket java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_79] at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.7.0_79] at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.7.0_79] at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.7.0_79] at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) ~[na:1.7.0_79] at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch0(DirectLogFetcher.java:154) ~[canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:70) ~[canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:113) [canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:210) [canal.parse-1.0.22.jar:na] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] 2016-09-05 08:12:45.181 [destination = canal-xmg , address = /172.30.248.148:3306 , EventParser] ERROR c.a.otter.canal.parse.inbound.mysql.MysqlEventParser - dump address /172.30.248.148:3306 has an error, retrying. caused by java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[na:1.7.0_79] at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[na:1.7.0_79] at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[na:1.7.0_79] at sun.nio.ch.IOUtil.read(IOUtil.java:197) ~[na:1.7.0_79] at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) ~[na:1.7.0_79] at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch0(DirectLogFetcher.java:154) ~[canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:70) ~[canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:113) ~[canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:210) ~[canal.parse-1.0.22.jar:na] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] 2016-09-05 08:12:45.183 [destination = canal-xmg , address = /172.30.248.148:3306 , EventParser] WARN c.a.o.s.a.i.setl.zookeeper.termin.WarningTerminProcess - nid:8[8:canal:canal-xmg:java.io.IOException: Connection reset by peer at sun.nio.ch.FileDispatcherImpl.read0(Native Method) at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) at sun.nio.ch.IOUtil.read(IOUtil.java:197) at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:379) at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch0(DirectLogFetcher.java:154) at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:70) at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:113) at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:210) at java.lang.Thread.run(Thread.java:745) ] 2016-09-05 08:12:45.185 [destination = canal-xmg , address = /172.30.248.148:3306 , EventParser] ERROR c.a.otter.canal.parse.inbound.mysql.MysqlEventParser - disconnect address /172.30.248.148:3306 has an error, retrying., caused by java.io.IOException: KILL DUMP 4408478 failure:java.io.IOException: ErrorPacket [errorNumber=1094, fieldCount=-1, message=Unknown thread id: 4408478, sqlState=HY000, sqlStateMarker=#] with command: KILL CONNECTION 4408478 at com.alibaba.otter.canal.parse.driver.mysql.MysqlUpdateExecutor.update(MysqlUpdateExecutor.java:49) at com.alibaba.otter.canal.parse.driver.mysql.MysqlConnector.disconnect(MysqlConnector.java:104) at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.disconnect(MysqlConnection.java:60) at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:242) at java.lang.Thread.run(Thread.java:745)
at com.alibaba.otter.canal.parse.driver.mysql.MysqlConnector.disconnect(MysqlConnector.java:106) ~[canal.parse.driver-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.disconnect(MysqlConnection.java:60) ~[canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:242) ~[canal.parse-1.0.22.jar:na] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
网络连接成功后,就重试了3次查找Bin-log位置失败和杀死连接:
2016-09-05 08:12:58.989 [destination = canal-xmg , address = /172.30.248.148:3306 , EventParser] WARN c.a.otter.canal.parse.inbound.mysql.MysqlEventParser - prepare to find start position just last position 2016-09-05 08:12:58.993 [destination = canal-xmg , address = /172.30.248.148:3306 , EventParser] ERROR c.a.o.canal.parse.inbound.mysql.dbsync.DirectLogFetcher - I/O error while reading from client socket java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:95) ~[canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:113) [canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:210) [canal.parse-1.0.22.jar:na] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79] 2016-09-05 08:12:58.993 [destination = canal-xmg , address = /172.30.248.148:3306 , EventParser] ERROR c.a.otter.canal.parse.inbound.mysql.MysqlEventParser - dump address /172.30.248.148:3306 has an error, retrying. caused by java.io.IOException: Received error packet: errno = 1236, sqlstate = HY000 errmsg = Could not find first log file name in binary log index file at com.alibaba.otter.canal.parse.inbound.mysql.dbsync.DirectLogFetcher.fetch(DirectLogFetcher.java:95) ~[canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.dump(MysqlConnection.java:113) ~[canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:210) ~[canal.parse-1.0.22.jar:na] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
2016-09-05 08:12:58.995 [destination = canal-xmg , address = /172.30.248.148:3306 , EventParser] ERROR c.a.otter.canal.parse.inbound.mysql.MysqlEventParser - disconnect address /172.30.248.148:3306 has an error, retrying., caused by java.io.IOException: KILL DUMP 4657772 failure:java.io.IOException: ErrorPacket [errorNumber=1094, fieldCount=-1, message=Unknown thread id: 4657772, sqlState=HY000, sqlStateMarker=#] with command: KILL CONNECTION 4657772 at com.alibaba.otter.canal.parse.driver.mysql.MysqlUpdateExecutor.update(MysqlUpdateExecutor.java:49) at com.alibaba.otter.canal.parse.driver.mysql.MysqlConnector.disconnect(MysqlConnector.java:104) at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.disconnect(MysqlConnection.java:60) at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:242) at java.lang.Thread.run(Thread.java:745)
at com.alibaba.otter.canal.parse.driver.mysql.MysqlConnector.disconnect(MysqlConnector.java:106) ~[canal.parse.driver-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.mysql.MysqlConnection.disconnect(MysqlConnection.java:60) ~[canal.parse-1.0.22.jar:na] at com.alibaba.otter.canal.parse.inbound.AbstractEventParser$3.run(AbstractEventParser.java:242) ~[canal.parse-1.0.22.jar:na] at java.lang.Thread.run(Thread.java:745) [na:1.7.0_79]
原提问者GitHub用户 dusonchen
alibaba/canal@c99b4b8 修复一个NPE问题
https://github.com/alibaba/canal/commit/c99b4b8cd1cae7613aea9491e72762e6c879cdcf
原回答者GitHub用户agapple
版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。