Troubleshooting record: clients.NetworkClient: Bootstrap broker ip:9092 disconnected on CDH 5.10


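1. Stable version combination currently used in this environment
a. This CDH environment has been upgraded four times; the current version is CDH-5.10.0-1.cdh5.10.0.p0.41
b. The Kafka version is KAFKA-2.1.0-1.2.1.0.p0.115
c. The Spark2 version is SPARK2-2.0.0.cloudera1-1.cdh5.7.0.p0.113931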




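2. Spark2 installation: troubleshooting and analysis
On the Hosts --> Parcels page, Spark2 is offered an upgrade to the cloudera2 build of this release, 2.0.0.cloudera2-1.cdh5.7.0.p0.118100.
After installing that version, however, the Spark History Server failed to start. The stdout and stderr logs of the startup shell script showed this error:
The CSD version (2.0.0.cloudera1) is not compatible with the current Spark 2 version (2.0.0.cloudera2)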

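Further analysis: the current CSD_VERSION is 2.0.0.cloudera1, whereas upgrading to the latest parcel makes SPARK2_VERSION 2.0.0.cloudera2, so the service cannot start.
I tried changing 2.0.0.cloudera2 to 2.0.0.cloudera1 in a table of the metadata database, but the web UI then immediately showed the Spark2 parcel as unavailable. Cloudera's checks really are thorough!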

In the end I chose the parcel whose version matches CSD_VERSION: SPARK2-2.0.0.cloudera1-1.cdh5.7.0.p0.113931
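The compatibility rule above (the Spark 2 version embedded in the parcel name must equal the CSD version) can be checked before activating a parcel. The following is only a sketch: the function names are hypothetical and not part of Cloudera Manager, and the cloudera2 parcel name is assumed to follow the same SPARK2-<version>-... pattern as the cloudera1 parcel.

```shell
#!/bin/sh
# Sketch: compare the Spark 2 version embedded in a parcel name with a CSD version.
# Both function names are hypothetical helpers, not Cloudera Manager commands.

# Extract "2.0.0.cloudera1" from "SPARK2-2.0.0.cloudera1-1.cdh5.7.0.p0.113931".
spark2_parcel_version() {
    local name="${1#SPARK2-}"      # drop the product prefix
    printf '%s\n' "${name%%-*}"    # drop everything from the first '-' onward
}

# Succeed only when the parcel's Spark 2 version equals the CSD version.
csd_matches_parcel() {
    local csd_version="$1"
    local parcel="$2"
    [ "$(spark2_parcel_version "$parcel")" = "$csd_version" ]
}

csd_version="2.0.0.cloudera1"
for parcel in \
    "SPARK2-2.0.0.cloudera1-1.cdh5.7.0.p0.113931" \
    "SPARK2-2.0.0.cloudera2-1.cdh5.7.0.p0.118100"; do
    if csd_matches_parcel "$csd_version" "$parcel"; then
        echo "OK       $parcel"
    else
        echo "MISMATCH $parcel"
    fi
done
```

With the CSD at 2.0.0.cloudera1, only the cloudera1 parcel passes the check, which matches the choice made above.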


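3. A jar was submitted to YARN with spark2-submit for a real-time Spark job that reads data from Kafka, but the job log showed the error named in the title: clients.NetworkClient: Bootstrap broker ip:9092 disconnected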



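4. Analyzing the error: I replaced every version referenced in the program's pom file with the versions of the current CDH, Kafka, and Spark2, then rebuilt the jar
(in fact, if you build a thin jar, i.e. one without bundled dependencies, a pom that references the Apache Maven artifacts also works);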

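I then suspected that the Kafka jars shipped with Spark2 on the cluster did not match the version of the CDH Kafka parcel,
so I backed up the old jars and copied the current Kafka jars into Spark2's jars folder (this is the key change).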
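The suspicion above can be confirmed before changing anything by comparing the kafka-clients jar in the Spark2 jars folder with the one in the Kafka parcel. This is a minimal sketch: the function names and the SPARK2_JARS / KAFKA_LIBS variables are hypothetical, and the default paths are the parcel directories used in section 4.2.

```shell
#!/bin/sh
# Sketch: detect a kafka-clients version mismatch between two jar directories.
# Function and variable names are hypothetical helpers for illustration.

# Print the version part of each kafka-clients jar found in a directory,
# e.g. "0.9.0-kafka-2.0.0" for kafka-clients-0.9.0-kafka-2.0.0.jar.
kafka_clients_versions() {
    local dir="$1"
    local jar base
    for jar in "$dir"/kafka-clients-*.jar; do
        [ -e "$jar" ] || continue        # the glob matched nothing
        base="${jar##*/}"                # strip the directory
        base="${base#kafka-clients-}"    # strip the artifact name
        printf '%s\n' "${base%.jar}"     # strip the extension
    done
}

# Succeed only when both directories contain the same, non-empty set of versions.
same_kafka_clients() {
    local a b
    a="$(kafka_clients_versions "$1")"
    b="$(kafka_clients_versions "$2")"
    [ -n "$a" ] && [ "$a" = "$b" ]
}

SPARK2_JARS="${SPARK2_JARS:-/opt/cloudera/parcels/SPARK2/lib/spark2/jars}"
KAFKA_LIBS="${KAFKA_LIBS:-/opt/cloudera/parcels/KAFKA/lib/kafka/libs}"

if same_kafka_clients "$SPARK2_JARS" "$KAFKA_LIBS"; then
    echo "kafka-clients versions match"
else
    echo "kafka-clients versions differ or could not be determined:"
    echo "  spark2: $(kafka_clients_versions "$SPARK2_JARS" | tr '\n' ' ')"
    echo "  kafka:  $(kafka_clients_versions "$KAFKA_LIBS" | tr '\n' ' ')"
fi
```

Files renamed to .jar.bak are ignored by the glob, so the check can be rerun after the swap to confirm that only the new version is visible to Spark2.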


4.1 pom file
Attachment: pom.rar

4.2 Perform the following on every host in the cluster


  # Go to the Spark2 jars folder and list the bundled Kafka jars (0.9.0)
  [root@sh-hadoop-01 ~]# cd /opt/cloudera/parcels/SPARK2/lib/spark2/jars/
  [root@sh-hadoop-01 jars]# ll
  ...............
  -rw-rw-r-- 1 root root 5001608 Dec 7 02:54 kafka_2.11-0.9.0-kafka-2.0.0.jar
  -rw-rw-r-- 1 root root 649382 Dec 7 02:54 kafka-clients-0.9.0-kafka-2.0.0.jar
  ..............
  # Back up the old jars so they are no longer picked up
  [root@sh-hadoop-01 jars]# mv kafka_2.11-0.9.0-kafka-2.0.0.jar kafka_2.11-0.9.0-kafka-2.0.0.jar.bak
  [root@sh-hadoop-01 jars]# mv kafka-clients-0.9.0-kafka-2.0.0.jar kafka-clients-0.9.0-kafka-2.0.0.jar.bak
  # Copy the jars of the current Kafka parcel (0.10.0) into the Spark2 jars folder
  [root@sh-hadoop-01 jars]# cd /opt/cloudera/parcels/KAFKA/lib/kafka/libs
  [root@sh-hadoop-01 libs]# cp /opt/cloudera/parcels/KAFKA/lib/kafka/libs/kafka_2.11-0.10.0-kafka-2.1.0.jar /opt/cloudera/parcels/SPARK2/lib/spark2/jars/
  [root@sh-hadoop-01 libs]# cp /opt/cloudera/parcels/KAFKA/lib/kafka/libs/kafka-clients-0.10.0-kafka-2.1.0.jar /opt/cloudera/parcels/SPARK2/lib/spark2/jars/
  # Verify the result
  [root@sh-hadoop-01 libs]# ll /opt/cloudera/parcels/SPARK2/lib/spark2/jars/
  ...............
  -rwxr-xr-x 1 root root 5156768 Mar 9 23:48 kafka_2.11-0.10.0-kafka-2.1.0.jar
  -rw-rw-r-- 1 root root 5001608 Dec 7 02:54 kafka_2.11-0.9.0-kafka-2.0.0.jar.bak
  -rwxr-xr-x 1 root root 747732 Mar 9 23:48 kafka-clients-0.10.0-kafka-2.1.0.jar
  -rw-rw-r-- 1 root root 649382 Dec 7 02:54 kafka-clients-0.9.0-kafka-2.0.0.jar.bak
  ...............
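Because the steps above have to be repeated on every host, they can be wrapped in a small function. The version below is a sketch rather than the commands actually run in this post: swap_kafka_jars is a hypothetical name, it assumes the Scala 2.11 jar naming seen in the listing, and it refuses to rename anything unless both replacement jars exist.

```shell
#!/bin/sh
# Sketch: back up the Kafka jars in the Spark2 jars folder and copy in the
# jars of a given version from the Kafka parcel. swap_kafka_jars is a
# hypothetical helper; it only runs when called explicitly.
#
# Usage: swap_kafka_jars <spark2-jars-dir> <kafka-libs-dir> <version>
swap_kafka_jars() {
    local spark_jars="$1"
    local kafka_libs="$2"
    local version="$3"
    local name jar

    # Refuse to touch anything unless both replacement jars exist.
    for name in "kafka_2.11-$version.jar" "kafka-clients-$version.jar"; do
        if [ ! -f "$kafka_libs/$name" ]; then
            echo "missing $kafka_libs/$name" >&2
            return 1
        fi
    done

    # Rename older Kafka jars to .bak; skip jars that already have the target version.
    for jar in "$spark_jars"/kafka_2.11-*.jar "$spark_jars"/kafka-clients-*.jar; do
        [ -e "$jar" ] || continue
        case "${jar##*/}" in
            "kafka_2.11-$version.jar"|"kafka-clients-$version.jar") continue ;;
        esac
        mv "$jar" "$jar.bak"
    done

    # Copy in the jars that match the Kafka parcel.
    for name in "kafka_2.11-$version.jar" "kafka-clients-$version.jar"; do
        cp "$kafka_libs/$name" "$spark_jars/"
    done
}
```

On a host laid out like the one above, the call would be swap_kafka_jars /opt/cloudera/parcels/SPARK2/lib/spark2/jars /opt/cloudera/parcels/KAFKA/lib/kafka/libs 0.10.0-kafka-2.1.0, run as root. Skipping jars that already carry the target version keeps a second run from renaming the freshly copied files.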


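5. The problem was fixed in the early hours of the morning; after resubmitting the jar, the job has run stably for 10 hours so far.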
