Flink HA mode: process hangs!!! (Q&A, Alibaba Cloud Developer Community)

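Hello everyone. I am new to Flink and ran into a problem while deploying Flink HA; I would appreciate your help. After starting Flink in HA mode, the JobManager process hangs immediately. I am using Flink 1.7.2. The log below contains the error "File does not exist: /flink/ha/zookeeper/submittedJobGraphb05001535f91". What puzzles me is that neither my checkpoint directory nor my HA directory is this path, so why does Flink look there? No JobGraph has been generated under the directories I configured, yet Flink keeps trying to retrieve the job graph /a5ffe00b0bc5688d9a7de5c62b8150e6 and cannot find it. I deleted all the related configured paths and rebuilt the cluster, but on startup it still tries to retrieve it. How can I stop Flink from looking for this JobGraph so that my HA cluster runs healthily?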

Error log:

2019-03-25 18:55:00,742 ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Fatal error occurred in the cluster entrypoint.
java.lang.RuntimeException: org.apache.flink.util.FlinkException: Could not retrieve submitted JobGraph from state handle under /a5ffe00b0bc5688d9a7de5c62b8150e6. This indicates that the retrieved state handle is broken. Try cleaning the state handle store.
    at org.apache.flink.util.ExceptionUtils.rethrow(ExceptionUtils.java:199)
    at org.apache.flink.util.function.FunctionUtils.lambda$uncheckedFunction$2(FunctionUtils.java:74)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
    .......
Caused by: org.apache.flink.util.FlinkException: Could not retrieve submitted JobGraph from state handle under /a5ffe00b0bc5688d9a7de5c62b8150e6. This indicates that the retrieved state handle is broken. Try cleaning the state handle store.
    at org.apache.flink.runtime.jobmanager.ZooKeeperSubmittedJobGraphStore.recoverJobGraph(ZooKeeperSubmittedJobGraphStore.java:208)
    at org.apache.flink.runtime.dispatcher.Dispatcher.recoverJob(Dispatcher.java:696)
    at org.apache.flink.runtime.dispatcher.Dispatcher.recoverJobGraphs(Dispatcher.java:681)
    ........
Caused by: java.io.FileNotFoundException: File does not exist: /flink/ha/zookeeper/submittedJobGraphb05001535f91
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2100)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2070)
    .......
Caused by: org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File does not exist: /flink/ha/zookeeper/submittedJobGraphb05001535f91
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:66)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:56)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:2100)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2070)
    .......

Thanks!

*From the Flink mailing list archive, compiled by volunteers.
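One possible reading of the log, offered as a sketch and not as a confirmed diagnosis: the trace goes through ZooKeeperSubmittedJobGraphStore, which suggests that the entry /a5ffe00b0bc5688d9a7de5c62b8150e6 is recorded in ZooKeeper and merely points at the HDFS file that no longer exists. If so, deleting HDFS directories alone would leave the entry behind. The Python snippet below pulls both names out of the log text and builds the znode path where such an entry might live. The helper names are hypothetical, and the path components /flink, /default and /jobgraphs are assumed defaults for high-availability.zookeeper.path.root, high-availability.cluster-id and high-availability.zookeeper.path.jobgraphs; check them against your own flink-conf.yaml.

```python
import re

# Excerpt of the error log from the question (stack frames omitted).
LOG = """\
Caused by: org.apache.flink.util.FlinkException: Could not retrieve submitted JobGraph from state handle under /a5ffe00b0bc5688d9a7de5c62b8150e6. This indicates that the retrieved state handle is broken. Try cleaning the state handle store.
Caused by: java.io.FileNotFoundException: File does not exist: /flink/ha/zookeeper/submittedJobGraphb05001535f91
"""

# Patterns follow the message wording that appears in the log above.
HANDLE_RE = re.compile(r"state handle under (/[0-9a-f]+)\.")
FILE_RE = re.compile(r"File does not exist: (\S+)")


def summarize(log_text):
    """Return the distinct state-handle names and missing files found in the log."""
    handles = sorted(set(HANDLE_RE.findall(log_text)))
    files = sorted(set(FILE_RE.findall(log_text)))
    return handles, files


def candidate_znode(handle, root="/flink", cluster_id="/default",
                    jobgraphs="/jobgraphs"):
    """Join the path components into one znode path.

    The default values are ASSUMPTIONS about the HA settings, not values
    taken from the question; pass your own configured values instead.
    """
    parts = [p.strip("/") for p in (root, cluster_id, jobgraphs, handle)]
    return "/" + "/".join(p for p in parts if p)


handles, files = summarize(LOG)
for h in handles:
    print("possible stale job graph entry:", candidate_znode(h))
for f in files:
    print("missing file it points to:", f)
```

The printed znode path can then be inspected with the ZooKeeper command-line client (for example with its ls command) before deciding whether to follow the log's own hint, "Try cleaning the state handle store."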
