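Introduction
In a distributed SOA (service-oriented architecture) system, the registry is responsible for service registration and for resolving service calls. In RocketMQ, NameServer plays a similar role. It is the coordination hub of the whole RocketMQ system: it maintains the routing information that message sending and consumption depend on.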
- Route management
- Route information
- Route registration
- Route deletion
- Summary
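I. Route Management
Before walking through how NameServer works, let us first look at the physical deployment architecture of RocketMQ:
(Figure: RocketMQ physical deployment architecture; image from the Internet)
In a RocketMQ physical deployment, each Broker registers with every NameServer when it starts. A message sender (Producer) then fetches the list of Broker addresses from a NameServer before sending messages. NameServer keeps a long-lived connection to each Broker and checks Broker liveness periodically: Brokers send a heartbeat every 30s, and NameServer scans for expired Brokers every 10s. If a Broker is found to be down, it is removed from the routing table. In this structure NameServer instances do not communicate with one another, but every Broker, Producer, and Consumer needs to communicate with NameServer.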
NameServer is started in the NamesrvStartup class. The source is as follows:

    public static NamesrvController main0(String[] args) {
        try {
            // Create the NamesrvController
            NamesrvController controller = createNamesrvController(args);
            // Start the NamesrvController
            start(controller);
            String tip = "The Name Server boot success. serializeType=" + RemotingCommand.getSerializeTypeConfigInThisServer();
            log.info(tip);
            System.out.printf("%s%n", tip);
            return controller;
        } catch (Throwable e) {
            e.printStackTrace();
            System.exit(-1);
        }
        return null;
    }
createNamesrvController builds the NamesrvController instance. The source is as follows:

    public static NamesrvController createNamesrvController(String[] args) throws IOException, JoranException {
        System.setProperty(RemotingCommand.REMOTING_VERSION_KEY, Integer.toString(MQVersion.CURRENT_VERSION));
        //PackageConflictDetect.detectFastjson();

        // Parse the startup arguments
        Options options = ServerUtil.buildCommandlineOptions(new Options());
        commandLine = ServerUtil.parseCmdLine("mqnamesrv", args, buildCommandlineOptions(options), new PosixParser());
        if (null == commandLine) {
            System.exit(-1);
            return null;
        }

        // NamesrvConfig holds the NameServer settings and NettyServerConfig the Netty settings;
        // both are used to construct the NamesrvController
        final NamesrvConfig namesrvConfig = new NamesrvConfig();
        final NettyServerConfig nettyServerConfig = new NettyServerConfig();
        nettyServerConfig.setListenPort(9876);

        // -c <file>: load settings from a properties file
        if (commandLine.hasOption('c')) {
            String file = commandLine.getOptionValue('c');
            if (file != null) {
                InputStream in = new BufferedInputStream(new FileInputStream(file));
                properties = new Properties();
                properties.load(in);
                MixAll.properties2Object(properties, namesrvConfig);
                MixAll.properties2Object(properties, nettyServerConfig);
                namesrvConfig.setConfigStorePath(file);
                System.out.printf("load config properties file OK, %s%n", file);
                in.close();
            }
        }

        // -p: print the effective configuration and exit
        if (commandLine.hasOption('p')) {
            InternalLogger console = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_CONSOLE_NAME);
            MixAll.printObjectProperties(console, namesrvConfig);
            MixAll.printObjectProperties(console, nettyServerConfig);
            System.exit(0);
        }

        // Command-line options override values loaded from the file
        MixAll.properties2Object(ServerUtil.commandLine2Properties(commandLine), namesrvConfig);

        if (null == namesrvConfig.getRocketmqHome()) {
            System.out.printf("Please set the %s variable in your environment to match the location of the RocketMQ installation%n", MixAll.ROCKETMQ_HOME_ENV);
            System.exit(-2);
        }

        // Configure logging
        LoggerContext lc = (LoggerContext) LoggerFactory.getILoggerFactory();
        JoranConfigurator configurator = new JoranConfigurator();
        configurator.setContext(lc);
        lc.reset();
        configurator.doConfigure(namesrvConfig.getRocketmqHome() + "/conf/logback_namesrv.xml");

        log = InternalLoggerFactory.getLogger(LoggerName.NAMESRV_LOGGER_NAME);
        MixAll.printObjectProperties(log, namesrvConfig);
        MixAll.printObjectProperties(log, nettyServerConfig);

        final NamesrvController controller = new NamesrvController(namesrvConfig, nettyServerConfig);

        // remember all configs to prevent discard
        controller.getConfiguration().registerConfig(properties);
        return controller;
    }
The NamesrvController is then started. The source is as follows:

    public static NamesrvController start(final NamesrvController controller) throws Exception {
        if (null == controller) {
            throw new IllegalArgumentException("NamesrvController is null");
        }

        // Initialize the NamesrvController instance
        boolean initResult = controller.initialize();
        if (!initResult) {
            controller.shutdown();
            System.exit(-3);
        }

        // Register a JVM shutdown hook so that the controller is shut down cleanly on exit
        Runtime.getRuntime().addShutdownHook(new ShutdownHookThread(log, new Callable<Void>() {
            @Override
            public Void call() throws Exception {
                controller.shutdown();
                return null;
            }
        }));

        // Start the server, which listens for network requests from Brokers and message producers
        controller.start();
        return controller;
    }
The NamesrvController is initialized as follows:

    public boolean initialize() {
        this.kvConfigManager.load();

        // Create the Netty server
        this.remotingServer = new NettyRemotingServer(this.nettyServerConfig, this.brokerHousekeepingService);

        // Create the worker thread pool
        this.remotingExecutor = Executors.newFixedThreadPool(nettyServerConfig.getServerWorkerThreads(), new ThreadFactoryImpl("RemotingExecutorThread_"));

        // Register the request processor
        this.registerProcessor();

        // Scan for inactive Brokers every 10s, starting 5s after initialization
        this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                NamesrvController.this.routeInfoManager.scanNotActiveBroker();
            }
        }, 5, 10, TimeUnit.SECONDS);

        // Print the KV configuration every 10 minutes
        this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
            @Override
            public void run() {
                NamesrvController.this.kvConfigManager.printAllPeriodically();
            }
        }, 1, 10, TimeUnit.MINUTES);

        if (TlsSystemConfig.tlsMode != TlsMode.DISABLED) {
            // Register a listener to reload SslContext
            try {
                fileWatchService = new FileWatchService(
                    new String[] {
                        TlsSystemConfig.tlsServerCertPath,
                        TlsSystemConfig.tlsServerKeyPath,
                        TlsSystemConfig.tlsServerTrustCertPath
                    },
                    new FileWatchService.Listener() {
                        boolean certChanged, keyChanged = false;

                        @Override
                        public void onChanged(String path) {
                            if (path.equals(TlsSystemConfig.tlsServerTrustCertPath)) {
                                log.info("The trust certificate changed, reload the ssl context");
                                reloadServerSslContext();
                            }
                            if (path.equals(TlsSystemConfig.tlsServerCertPath)) {
                                certChanged = true;
                            }
                            if (path.equals(TlsSystemConfig.tlsServerKeyPath)) {
                                keyChanged = true;
                            }
                            if (certChanged && keyChanged) {
                                log.info("The certificate and private key changed, reload the ssl context");
                                certChanged = keyChanged = false;
                                reloadServerSslContext();
                            }
                        }

                        private void reloadServerSslContext() {
                            ((NettyRemotingServer) remotingServer).loadSslContext();
                        }
                    });
            } catch (Exception e) {
                log.warn("FileWatchService created error, can't load the certificate dynamically");
            }
        }

        return true;
    }
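II. Route Information
NameServer mainly provides Topic routing information to producers and consumers. So where is this routing information stored? The RouteInfoManager source contains the following fields: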

    private final HashMap<String/* topic */, List<QueueData>> topicQueueTable;
    private final HashMap<String/* brokerName */, BrokerData> brokerAddrTable;
    private final HashMap<String/* clusterName */, Set<String/* brokerName */>> clusterAddrTable;
    private final HashMap<String/* brokerAddr */, BrokerLiveInfo> brokerLiveTable;
    private final HashMap<String/* brokerAddr */, List<String>/* Filter Server */> filterServerTable;

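1. topicQueueTable
Routing information for Topic message queues. It stores the queue attributes of every Topic, and message sending uses this table for load balancing.
It is a HashMap whose key is the Topic name and whose value is a List<QueueData>. QueueData is defined as follows: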

    public class QueueData implements Comparable<QueueData> {
        // Broker name
        private String brokerName;
        // Number of read queues
        private int readQueueNums;
        // Number of write queues
        private int writeQueueNums;
        // Read/write permission
        private int perm;
        // Topic sync flag
        private int topicSynFlag;
        ...
    }

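The perm field is an int, which suggests a bit mask. The sketch below shows how such a mask can be tested. The constant values (read = 1 << 2, write = 1 << 1) are assumptions made for illustration and should be checked against RocketMQ's own permission constants; PermDemo and its methods are hypothetical names.

```java
class PermDemo {
    // Assumed bit positions, for illustration only
    static final int PERM_READ = 1 << 2;   // 4
    static final int PERM_WRITE = 1 << 1;  // 2

    static boolean isReadable(int perm) {
        return (perm & PERM_READ) == PERM_READ;
    }

    static boolean isWriteable(int perm) {
        return (perm & PERM_WRITE) == PERM_WRITE;
    }

    // Render the mask as "R", "W", "RW" or "-" for logging
    static String describe(int perm) {
        StringBuilder sb = new StringBuilder();
        if (isReadable(perm)) {
            sb.append('R');
        }
        if (isWriteable(perm)) {
            sb.append('W');
        }
        return sb.length() == 0 ? "-" : sb.toString();
    }
}
```

Under these assumed values a queue with perm = 6 is both readable and writable, while perm = 4 is read-only.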
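2. brokerAddrTable
Basic information about each Broker: the brokerName, the name of the cluster it belongs to, and the addresses of the master and slave Brokers.
The key is the brokerName and the value is a BrokerData. The relevant fields of BrokerData are: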

    public class BrokerData implements Comparable<BrokerData> {
        private String cluster;
        private String brokerName;
        private HashMap<Long/* brokerId */, String/* broker address */> brokerAddrs;
        ...
    }

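Several machines may share the same Broker name. In other words, one brokerName can denote a group made up of one Master and multiple Slaves. BrokerData therefore stores the Broker name, the cluster it belongs to, and the address of each machine in the group.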
3. clusterAddrTable
Broker cluster information: the names of all Brokers in each cluster.
The key is the cluster name and the value is the set of Broker names.
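4. brokerLiveTable
Real-time status of each Broker machine. NameServer updates it every time a heartbeat arrives.
The key is the Broker address, which corresponds to one machine, and the value is a BrokerLiveInfo holding the live state of that Broker. lastUpdateTimestamp records the time of the last status update. NameServer periodically checks how fresh this timestamp is. If it has not been updated for longer than a threshold, the Broker's address is removed from the table.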

    class BrokerLiveInfo {
        // Time of the last update
        private long lastUpdateTimestamp;
        // Data version
        private DataVersion dataVersion;
        // Connection to the Broker
        private Channel channel;
        // HA server address
        private String haServerAddr;
        ...
    }

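5. filterServerTable
Filter server information: the list of FilterServers on each Broker, used for class-mode message filtering. The key is the Broker address and the value is the list of FilterServer addresses.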
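To see how these tables work together, here is a minimal sketch of a route lookup that joins topicQueueTable with brokerAddrTable. It is not RouteInfoManager itself: QueueData is reduced to the Broker name, RouteTablesSketch and its methods are hypothetical names, and treating brokerId 0 as the Master is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RouteTablesSketch {
    // topic -> names of the Brokers hosting queues for it (QueueData reduced to brokerName)
    final Map<String, List<String>> topicQueueTable = new HashMap<>();
    // brokerName -> (brokerId -> broker address)
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>();

    // Record that the given Broker machine serves the topic
    void register(String topic, String brokerName, long brokerId, String addr) {
        List<String> names = topicQueueTable.computeIfAbsent(topic, k -> new ArrayList<>());
        if (!names.contains(brokerName)) {
            names.add(brokerName);
        }
        brokerAddrTable.computeIfAbsent(brokerName, k -> new HashMap<>()).put(brokerId, addr);
    }

    // Resolve a topic to the Master addresses a producer could send to
    // (assumption of this sketch: brokerId 0 denotes the Master)
    List<String> masterAddresses(String topic) {
        List<String> result = new ArrayList<>();
        List<String> names = topicQueueTable.getOrDefault(topic, Collections.emptyList());
        for (String brokerName : names) {
            Map<Long, String> addrs = brokerAddrTable.get(brokerName);
            String master = addrs == null ? null : addrs.get(0L);
            if (master != null) {
                result.add(master);
            }
        }
        return result;
    }
}
```

A producer-side lookup is then a two-step join: topic to Broker names, then Broker name to machine address.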
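III. Route Registration
The previous section described the routing information itself. So how is route registration implemented in RocketMQ? It is driven by the heartbeat between Broker and NameServer. Once a Broker has started, it sends a heartbeat to every NameServer in the cluster, and while it is running it repeats this every 30 seconds. When a heartbeat arrives, NameServer updates the lastUpdateTimestamp of the corresponding BrokerLiveInfo in brokerLiveTable. Meanwhile, NameServer scans brokerLiveTable every 10s. If no heartbeat has been received from a Broker for 120s, that Broker's routing information is removed.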
On the Broker side, heartbeats are sent from the start() method of BrokerController. The relevant code is:

    this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
        @Override
        public void run() {
            try {
                // Register this Broker with all NameServers
                BrokerController.this.registerBrokerAll(true, false, brokerConfig.isForceRegister());
            } catch (Throwable e) {
                log.error("registerBrokerAll Exception", e);
            }
        }
    }, 1000 * 10, Math.max(10000, Math.min(brokerConfig.getRegisterNameServerPeriod(), 60000)), TimeUnit.MILLISECONDS);

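The period expression in this call clamps the configured value to the range from 10s to 60s. The small sketch below isolates that arithmetic. HeartbeatPeriod and effectivePeriodMillis are hypothetical names, and the 30000 ms input is only assumed to be the configured value, chosen because it matches the 30s interval described above.

```java
class HeartbeatPeriod {
    // Same arithmetic as the scheduling call: never below 10s, never above 60s
    static long effectivePeriodMillis(long configuredMillis) {
        return Math.max(10000, Math.min(configuredMillis, 60000));
    }

    public static void main(String[] args) {
        System.out.println(effectivePeriodMillis(1000));    // too small, raised to the lower bound
        System.out.println(effectivePeriodMillis(30000));   // within range, used as configured
        System.out.println(effectivePeriodMillis(120000));  // too large, capped at the upper bound
    }
}
```

Because of the upper bound, even a misconfigured period cannot push the heartbeat interval near the 120s expiry threshold used by NameServer.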
registerBrokerAll in turn calls doRegisterBrokerAll, shown below:

    private void doRegisterBrokerAll(boolean checkOrderConfig, boolean oneway,
        TopicConfigSerializeWrapper topicConfigWrapper) {
        // Send the registration request to every NameServer and collect the results
        List<RegisterBrokerResult> registerBrokerResultList = this.brokerOuterAPI.registerBrokerAll(
            this.brokerConfig.getBrokerClusterName(),
            this.getBrokerAddr(),
            this.brokerConfig.getBrokerName(),
            this.brokerConfig.getBrokerId(),
            this.getHAServerAddr(),
            topicConfigWrapper,
            this.filterServerManager.buildNewFilterServerList(),
            oneway,
            this.brokerConfig.getRegisterBrokerTimeoutMills(),
            this.brokerConfig.isCompressedRegister());

        if (registerBrokerResultList.size() > 0) {
            // Only the first result is used to update local state
            RegisterBrokerResult registerBrokerResult = registerBrokerResultList.get(0);
            if (registerBrokerResult != null) {
                if (this.updateMasterHAServerAddrPeriodically && registerBrokerResult.getHaServerAddr() != null) {
                    this.messageStore.updateHaMasterAddress(registerBrokerResult.getHaServerAddr());
                }

                this.slaveSynchronize.setMasterAddr(registerBrokerResult.getMasterAddr());

                if (checkOrderConfig) {
                    this.getTopicConfigManager().updateOrderTopicConfig(registerBrokerResult.getKvTable());
                }
            }
        }
    }

The registerBrokerAll method of BrokerOuterAPI registers the Broker with every NameServer and returns the list of registration results:

    public List<RegisterBrokerResult> registerBrokerAll(
        // Cluster name
        final String clusterName,
        // Broker address
        final String brokerAddr,
        // Broker name
        final String brokerName,
        // Broker ID
        final long brokerId,
        // HA server address
        final String haServerAddr,
        final TopicConfigSerializeWrapper topicConfigWrapper,
        final List<String> filterServerList,
        final boolean oneway,
        final int timeoutMills,
        final boolean compressed) {

        final List<RegisterBrokerResult> registerBrokerResultList = Lists.newArrayList();
        List<String> nameServerAddressList = this.remotingClient.getNameServerAddressList();
        if (nameServerAddressList != null && nameServerAddressList.size() > 0) {

            // Build the request header
            final RegisterBrokerRequestHeader requestHeader = new RegisterBrokerRequestHeader();
            requestHeader.setBrokerAddr(brokerAddr);
            requestHeader.setBrokerId(brokerId);
            requestHeader.setBrokerName(brokerName);
            requestHeader.setClusterName(clusterName);
            requestHeader.setHaServerAddr(haServerAddr);
            requestHeader.setCompressed(compressed);

            // Build the request body and its CRC32 checksum
            RegisterBrokerBody requestBody = new RegisterBrokerBody();
            requestBody.setTopicConfigSerializeWrapper(topicConfigWrapper);
            requestBody.setFilterServerList(filterServerList);
            final byte[] body = requestBody.encode(compressed);
            final int bodyCrc32 = UtilAll.crc32(body);
            requestHeader.setBodyCrc32(bodyCrc32);

            final CountDownLatch countDownLatch = new CountDownLatch(nameServerAddressList.size());
            // Iterate over all NameServers
            for (final String namesrvAddr : nameServerAddressList) {
                brokerOuterExecutor.execute(new Runnable() {
                    @Override
                    public void run() {
                        try {
                            // Register with this NameServer
                            RegisterBrokerResult result = registerBroker(namesrvAddr, oneway, timeoutMills, requestHeader, body);
                            if (result != null) {
                                registerBrokerResultList.add(result);
                            }

                            log.info("register broker to name server {} OK", namesrvAddr);
                        } catch (Exception e) {
                            log.warn("registerBroker Exception, {}", namesrvAddr, e);
                        } finally {
                            countDownLatch.countDown();
                        }
                    }
                });
            }

            // Wait until every registration has finished or the timeout expires
            try {
                countDownLatch.await(timeoutMills, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
            }
        }

        return registerBrokerResultList;
    }

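The core of registerBrokerAll is a fan-out pattern: one task per NameServer, with a CountDownLatch that waits for all of them up to a timeout. The sketch below reproduces only that pattern with JDK classes. FanOutRegistration, registerAll, and the registerOne callback are hypothetical names, and the use of CopyOnWriteArrayList is a choice made for this sketch because the results are appended from worker threads.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

class FanOutRegistration {
    // Calls registerOne for every address in parallel and returns the non-null results
    // that arrived before the timeout.
    static List<String> registerAll(List<String> nameServers,
                                    Function<String, String> registerOne,
                                    long timeoutMillis) {
        List<String> results = new CopyOnWriteArrayList<>();
        if (nameServers.isEmpty()) {
            return results;
        }
        ExecutorService pool = Executors.newFixedThreadPool(Math.min(4, nameServers.size()));
        CountDownLatch latch = new CountDownLatch(nameServers.size());
        try {
            for (String addr : nameServers) {
                pool.execute(() -> {
                    try {
                        String result = registerOne.apply(addr);
                        if (result != null) {
                            results.add(result);
                        }
                    } catch (RuntimeException e) {
                        // A failing NameServer must not prevent registration with the others
                    } finally {
                        // Always count down, otherwise the caller waits for the full timeout
                        latch.countDown();
                    }
                });
            }
            latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return results;
    }
}
```

Counting down in a finally block is what keeps one unreachable NameServer from stalling the whole heartbeat round.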
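IV. Route Deletion
NameServer scans the brokerLiveTable status table every 10s. If the lastUpdateTimestamp of a BrokerLiveInfo is more than 120s behind the current time, the Broker is considered dead: it must be removed, and its connection must be closed. Let us look at the code.
When the NamesrvController instance is initialized, it schedules the periodic check: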

    this.scheduledExecutorService.scheduleAtFixedRate(new Runnable() {
        @Override
        public void run() {
            NamesrvController.this.routeInfoManager.scanNotActiveBroker();
        }
    }, 5, 10, TimeUnit.SECONDS);

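The scanNotActiveBroker() method of RouteInfoManager iterates over brokerLiveTable and checks the lastUpdateTimestamp field of each BrokerLiveInfo. This timestamp is the time the last heartbeat was received, and the method tests whether it is more than 120s behind the current time.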

    public void scanNotActiveBroker() {
        // Iterate over the Brokers currently considered alive
        Iterator<Entry<String, BrokerLiveInfo>> it = this.brokerLiveTable.entrySet().iterator();
        while (it.hasNext()) {
            Entry<String, BrokerLiveInfo> next = it.next();
            // Time at which the last heartbeat was received
            long last = next.getValue().getLastUpdateTimestamp();
            if ((last + BROKER_CHANNEL_EXPIRED_TIME) < System.currentTimeMillis()) {
                // Expired: close the connection, drop the entry, then clean up the other tables
                RemotingUtil.closeChannel(next.getValue().getChannel());
                it.remove();
                log.warn("The broker channel expired, {} {}ms", next.getKey(), BROKER_CHANNEL_EXPIRED_TIME);
                this.onChannelDestroy(next.getKey(), next.getValue().getChannel());
            }
        }
    }

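The expiry rule can be tried out in isolation with a small sketch that takes the current time as a parameter instead of reading the system clock. LivenessScanner and its methods are hypothetical names, and the table is reduced to a map from Broker address to the last heartbeat time; the 120s threshold is the one described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

class LivenessScanner {
    // The 120s threshold described in the text
    static final long EXPIRED_MILLIS = 120_000L;

    // brokerAddr -> time of the last heartbeat, in milliseconds
    private final Map<String, Long> lastHeartbeat = new HashMap<>();

    // Called whenever a heartbeat arrives
    void onHeartbeat(String brokerAddr, long nowMillis) {
        lastHeartbeat.put(brokerAddr, nowMillis);
    }

    // Removes every Broker whose last heartbeat is older than the threshold
    // and returns the removed addresses
    List<String> scan(long nowMillis) {
        List<String> removed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastHeartbeat.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() + EXPIRED_MILLIS < nowMillis) {
                String addr = entry.getKey();
                // Iterator.remove() is the safe way to delete while iterating
                it.remove();
                removed.add(addr);
            }
        }
        return removed;
    }
}
```

Since the comparison is strict and the scan runs every 10s, a Broker that stops sending heartbeats is dropped by the first scan that runs more than 120s after its last heartbeat.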
The onChannelDestroy method then removes the Broker from the routing tables, as shown below:

    public void onChannelDestroy(String remoteAddr, Channel channel) {
        String brokerAddrFound = null;
        if (channel != null) {
            try {
                try {
                    // Under the read lock, find the Broker address that owns this channel
                    this.lock.readLock().lockInterruptibly();
                    Iterator<Entry<String, BrokerLiveInfo>> itBrokerLiveTable = this.brokerLiveTable.entrySet().iterator();
                    while (itBrokerLiveTable.hasNext()) {
                        Entry<String, BrokerLiveInfo> entry = itBrokerLiveTable.next();
                        if (entry.getValue().getChannel() == channel) {
                            brokerAddrFound = entry.getKey();
                            break;
                        }
                    }
                } finally {
                    this.lock.readLock().unlock();
                }
            } catch (Exception e) {
                log.error("onChannelDestroy Exception", e);
            }
        }

        if (null == brokerAddrFound) {
            brokerAddrFound = remoteAddr;
        } else {
            log.info("the broker's channel destroyed, {}, clean it's data structure at once", brokerAddrFound);
        }

        if (brokerAddrFound != null && brokerAddrFound.length() > 0) {
            try {
                try {
                    // Acquire the write lock
                    this.lock.writeLock().lockInterruptibly();
                    // Remove the Broker from brokerLiveTable and filterServerTable
                    this.brokerLiveTable.remove(brokerAddrFound);
                    this.filterServerTable.remove(brokerAddrFound);

                    // Maintain brokerAddrTable: remove the address, and the whole entry if no address is left
                    String brokerNameFound = null;
                    boolean removeBrokerName = false;
                    Iterator<Entry<String, BrokerData>> itBrokerAddrTable = this.brokerAddrTable.entrySet().iterator();
                    while (itBrokerAddrTable.hasNext() && (null == brokerNameFound)) {
                        BrokerData brokerData = itBrokerAddrTable.next().getValue();

                        Iterator<Entry<Long, String>> it = brokerData.getBrokerAddrs().entrySet().iterator();
                        while (it.hasNext()) {
                            Entry<Long, String> entry = it.next();
                            Long brokerId = entry.getKey();
                            String brokerAddr = entry.getValue();
                            if (brokerAddr.equals(brokerAddrFound)) {
                                brokerNameFound = brokerData.getBrokerName();
                                it.remove();
                                log.info("remove brokerAddr[{}, {}] from brokerAddrTable, because channel destroyed", brokerId, brokerAddr);
                                break;
                            }
                        }

                        if (brokerData.getBrokerAddrs().isEmpty()) {
                            removeBrokerName = true;
                            itBrokerAddrTable.remove();
                            log.info("remove brokerName[{}] from brokerAddrTable, because channel destroyed", brokerData.getBrokerName());
                        }
                    }

                    // Maintain clusterAddrTable: remove the Broker name from its cluster
                    if (brokerNameFound != null && removeBrokerName) {
                        Iterator<Entry<String, Set<String>>> it = this.clusterAddrTable.entrySet().iterator();
                        while (it.hasNext()) {
                            Entry<String, Set<String>> entry = it.next();
                            String clusterName = entry.getKey();
                            Set<String> brokerNames = entry.getValue();
                            // Remove the brokerName
                            boolean removed = brokerNames.remove(brokerNameFound);
                            if (removed) {
                                log.info("remove brokerName[{}], clusterName[{}] from clusterAddrTable, because channel destroyed", brokerNameFound, clusterName);

                                if (brokerNames.isEmpty()) {
                                    log.info("remove the clusterName[{}] from clusterAddrTable, because channel destroyed and no broker in this cluster", clusterName);
                                    it.remove();
                                }

                                break;
                            }
                        }
                    }

                    // Maintain topicQueueTable: remove the queues hosted by the removed Broker
                    if (removeBrokerName) {
                        Iterator<Entry<String, List<QueueData>>> itTopicQueueTable = this.topicQueueTable.entrySet().iterator();
                        while (itTopicQueueTable.hasNext()) {
                            Entry<String, List<QueueData>> entry = itTopicQueueTable.next();
                            String topic = entry.getKey();
                            List<QueueData> queueDataList = entry.getValue();

                            Iterator<QueueData> itQueueData = queueDataList.iterator();
                            while (itQueueData.hasNext()) {
                                QueueData queueData = itQueueData.next();
                                if (queueData.getBrokerName().equals(brokerNameFound)) {
                                    // Remove this Broker's queue data for the Topic
                                    itQueueData.remove();
                                    log.info("remove topic[{} {}], from topicQueueTable, because channel destroyed", topic, queueData);
                                }
                            }

                            if (queueDataList.isEmpty()) {
                                itTopicQueueTable.remove();
                                log.info("remove topic[{}] all queue, from topicQueueTable, because channel destroyed", topic);
                            }
                        }
                    }
                } finally {
                    // Release the write lock
                    this.lock.writeLock().unlock();
                }
            } catch (Exception e) {
                log.error("onChannelDestroy Exception", e);
            }
        }
    }

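The cascade in onChannelDestroy is easier to follow in a reduced form. The sketch below keeps only three tables and plain collections, and omits the read/write lock and logging of the real method. CascadeRemoval and its methods are hypothetical names. The rule it illustrates: the cluster and topic tables are touched only once the last address of a Broker name has disappeared.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CascadeRemoval {
    // brokerName -> (brokerId -> address)
    final Map<String, Map<Long, String>> brokerAddrTable = new HashMap<>();
    // clusterName -> broker names
    final Map<String, Set<String>> clusterAddrTable = new HashMap<>();
    // topic -> names of the Brokers hosting its queues
    final Map<String, List<String>> topicQueueTable = new HashMap<>();

    void add(String cluster, String brokerName, long brokerId, String addr, String topic) {
        brokerAddrTable.computeIfAbsent(brokerName, k -> new HashMap<>()).put(brokerId, addr);
        clusterAddrTable.computeIfAbsent(cluster, k -> new HashSet<>()).add(brokerName);
        List<String> names = topicQueueTable.computeIfAbsent(topic, k -> new ArrayList<>());
        if (!names.contains(brokerName)) {
            names.add(brokerName);
        }
    }

    void removeBrokerAddr(String addr) {
        String emptiedBroker = null;
        Iterator<Map.Entry<String, Map<Long, String>>> it = brokerAddrTable.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Map<Long, String>> entry = it.next();
            // Removing from the values() view removes the brokerId -> address mapping
            if (entry.getValue().values().remove(addr)) {
                if (entry.getValue().isEmpty()) {
                    emptiedBroker = entry.getKey();
                    it.remove();
                }
                break;
            }
        }
        if (emptiedBroker == null) {
            return; // other machines still serve this Broker name
        }
        final String name = emptiedBroker;
        // Drop the Broker name from its cluster, then drop clusters left empty
        clusterAddrTable.values().forEach(names -> names.remove(name));
        clusterAddrTable.values().removeIf(Set::isEmpty);
        // Drop the Broker's queues from every topic, then drop topics left empty
        topicQueueTable.values().forEach(names -> names.remove(name));
        topicQueueTable.values().removeIf(List::isEmpty);
    }
}
```

Removing a Slave address therefore leaves the topic routes intact, while removing the last address of a Broker name clears it from all three tables.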
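V. Summary
This article described the route management performed by NameServer. Route management is an essential prerequisite for sending and consuming messages in RocketMQ, and understanding it gives us a deeper understanding of how RocketMQ works.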