缘起
在RocketMQ客户端的DefaultMQPushConsumer的start方法被执行时,时不时会报出invokeSync call timeout异常,如下:
Caused by: java.lang.IllegalStateException: org.apache.rocketmq.remoting.exception.RemotingTimeoutException: invokeSync call timeout
at org.apache.rocketmq.client.impl.factory.MQClientInstance.updateTopicRouteInfoFromNameServer(MQClientInstance.java:679) ~[rocketmq-client-4.7.1.jar:4.7.1]
at org.apache.rocketmq.client.impl.factory.MQClientInstance.updateTopicRouteInfoFromNameServer(MQClientInstance.java:509) ~[rocketmq-client-4.7.1.jar:4.7.1]
at org.apache.rocketmq.client.impl.consumer.DefaultMQPushConsumerImpl.updateTopicSubscribeInfoWhenSubscriptionChanged(DefaultMQPushConsumerImpl.java:872) ~[rocketmq-client-4.7.1.jar:4.7.1]
at org.apache.rocketmq.client.impl.consumer.DefaultMQPushConsumerImpl.start(DefaultMQPushConsumerImpl.java:653) ~[rocketmq-client-4.7.1.jar:4.7.1]
at org.apache.rocketmq.client.consumer.DefaultMQPushConsumer.start(DefaultMQPushConsumer.java:698) ~[rocketmq-client-4.7.1.jar:4.7.1]
at cn.xdf.xcloud.rocketmq.support.DefaultRocketMQListenerContainer.start(DefaultRocketMQListenerContainer.java:276) ~[xcloud-rocketmq-core-1.2.0.RELEASE.jar:1.2.0.RELEASE]
at cn.xdf.xcloud.rocketmq.autoconfigure.ListenerContainerConfiguration.registerContainer(ListenerContainerConfiguration.java:103) ~[xcloud-rocketmq-core-1.2.0.RELEASE.jar:1.2.0.RELEASE]
... 12 common frames omitted
Caused by: org.apache.rocketmq.remoting.exception.RemotingTimeoutException: invokeSync call timeout
at org.apache.rocketmq.remoting.netty.NettyRemotingClient.invokeSync(NettyRemotingClient.java:375) ~[rocketmq-remoting-4.7.1.jar:4.7.1]
at org.apache.rocketmq.client.impl.MQClientAPIImpl.getTopicRouteInfoFromNameServer(MQClientAPIImpl.java:1363) ~[rocketmq-client-4.7.1.jar:4.7.1]
at org.apache.rocketmq.client.impl.MQClientAPIImpl.getTopicRouteInfoFromNameServer(MQClientAPIImpl.java:1353) ~[rocketmq-client-4.7.1.jar:4.7.1]
at org.apache.rocketmq.client.impl.factory.MQClientInstance.updateTopicRouteInfoFromNameServer(MQClientInstance.java:622) ~[rocketmq-client-4.7.1.jar:4.7.1]
... 18 common frames omitted
如果着急马上找到解决办法,可以直接跳到解决办法。不过,授人以鱼,不如授之以渔。还是建议把寻找解决办法的过程看完,第一:可以给你以后遇到类似问题提供解决思路;第二:虽然都报这个异常,但产生的原因可能不一样。
寻找解决办法之路
做为面向搜索引擎编程的一员,立马复制关键字invokeSync call timeout去搜索引擎,得到的解决办法总结起来有两点:
- RocketMQ客户端和服务端版本不一致,检查了一下客户端和服务端的版本,都是4.7.1。
- 降低RocketMQ客户端的版本,这个我时不能接受的。
搜索引擎无法解决,只能自己想办法了。首先找到报异常的地方:
public RemotingCommand invokeSync(String addr, final RemotingCommand request, long timeoutMillis)
throws InterruptedException, RemotingConnectException, RemotingSendRequestException, RemotingTimeoutException {
long beginStartTime = System.currentTimeMillis();
final Channel channel = this.getAndCreateChannel(addr);
if (channel != null && channel.isActive()) {
try {
doBeforeRpcHooks(addr, request);
long costTime = System.currentTimeMillis() - beginStartTime;
if (timeoutMillis < costTime) {
throw new RemotingTimeoutException("invokeSync call timeout");
}
RemotingCommand response = this.invokeSyncImpl(channel, request, timeoutMillis - costTime);
doAfterRpcHooks(RemotingHelper.parseChannelRemoteAddr(channel), request, response);
return response;
}
//省略部分无关代码
} else {
this.closeChannel(addr, channel);
throw new RemotingConnectException(addr);
}
}
原来是因为代码执行的时间过长,才报出了invokeSync call timeout异常。首先想到的是延长超时时间,继续分析源码,向上寻找调用方,发现在MQClientInstance的updateTopicRouteInfoFromNameServer方法中有:
topicRouteData = this.mQClientAPIImpl.getTopicRouteInfoFromNameServer(topic, 1000 * 3);
居然是写死了3秒,没有办法修改,我竟无语凝噎。
再向下一步一步地分析源码,到底是哪里慢?
org.apache.rocketmq.remoting.netty.NettyRemotingClient.getAndCreateChannel
org.apache.rocketmq.remoting.netty.NettyRemotingClient.getAndCreateNameserverChannel
org.apache.rocketmq.remoting.netty.NettyRemotingClient.createChannel
io.netty.bootstrap.Bootstrap.connect(java.net.SocketAddress)
io.netty.bootstrap.Bootstrap.doResolveAndConnect
io.netty.bootstrap.AbstractBootstrap.initAndRegister
io.netty.bootstrap.ChannelFactory.newChannel
io.netty.channel.socket.nio.NioSocketChannel.NioSocketChannel()
io.netty.channel.nio.AbstractNioChannel.AbstractNioChannel
io.netty.channel.AbstractChannel.AbstractChannel(io.netty.channel.Channel)
io.netty.channel.AbstractChannel.newId
io.netty.channel.DefaultChannelId.newInstance
最终找到了:
public static DefaultChannelId newInstance() {
return new DefaultChannelId();
}
在创建DefaultChannelId的实例时,执行了这个类的静态代码块,就是这段静态代码块比较耗时。
那么,解决办法就有了,提前加载DefaultChannelId类,使其静态代码块先执行完成。
解决办法
在调用DefaultMQPushConsumer的start方法之前,插入如下代码:
DefaultChannelId.newInstance();