
Flink 1.11 submit job timed out

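I am using Flink 1.11, deployed as a Kubernetes session cluster, with 30 TaskManagers (TMs) of 4 slots each and a job parallelism of 120. When I submit the job, a large number of "No hostname could be resolved for the IP address" warnings appear, the JobManager (JM) times out, and the submission fails. The web UI also freezes and stops responding.

Submitting WordCount with a parallelism of only 1 triggers the warning too: a few "No hostname" lines are logged and then the job submits normally. Once the parallelism is raised, the submission times out.

Partial logs: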

2020-07-15 16:58:46,460 WARN org.apache.flink.runtime.taskmanager.TaskManagerLocation [] - No hostname could be resolved for the IP address 10.32.160.7, using IP address as host name. Local input split assignment (such as for HDFS files) may be impacted.
2020-07-15 16:58:46,460 WARN org.apache.flink.runtime.taskmanager.TaskManagerLocation [] - No hostname could be resolved for the IP address 10.44.224.7, using IP address as host name. Local input split assignment (such as for HDFS files) may be impacted.
2020-07-15 16:58:46,461 WARN org.apache.flink.runtime.taskmanager.TaskManagerLocation [] - No hostname could be resolved for the IP address 10.40.32.9, using IP address as host name. Local input split assignment (such as for HDFS files) may be impacted.

2020-07-15 16:59:10,236 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - The heartbeat of JobManager with id 69a0d460de468888a9f41c770d963c0a timed out.
2020-07-15 16:59:10,236 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] - Disconnect job manager 00000000000000000000000000000000@akka.tcp://flink@flink-jobmanager:6123/user/rpc/jobmanager_2 for job e1554c737e37ed79688a15c746b6e9ef from the resource manager.

How should I deal with this? *From the Flink mailing list archive compiled by volunteers

说了是一只鲳鱼 2021-12-07 11:11:48
1 answer
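  • I have run into a similar host resolution problem before. You could investigate it from the angle of how pod and node networking is mapped in k8s. I hope this helps. *From the Flink mailing list archive compiled by volunteers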

    2021-12-07 11:27:57
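
    One way to start the investigation suggested above is to time reverse DNS lookups for the TaskManager pod IPs. The following is only a minimal sketch and rests on assumptions that are not in the original thread: that Python 3 is available in a pod on the same cluster network as the JobManager (for example a temporary debug pod), and that slow or failing reverse lookups are what is delaying submission. The helper name time_reverse_lookups is made up for this illustration.

```python
import socket
import time


def time_reverse_lookups(ips, resolver=socket.gethostbyaddr):
    """Reverse-resolve each IP and record how long the attempt took.

    Returns a list of (ip, hostname_or_None, elapsed_seconds) tuples.
    `resolver` is injectable so the function can be exercised offline;
    by default it uses the system resolver via socket.gethostbyaddr.
    """
    results = []
    for ip in ips:
        start = time.monotonic()
        try:
            # gethostbyaddr returns (hostname, aliaslist, ipaddrlist)
            hostname = resolver(ip)[0]
        except OSError:
            # socket.herror and socket.gaierror are OSError subclasses;
            # treat any of them as "no hostname for this IP"
            hostname = None
        elapsed = time.monotonic() - start
        results.append((ip, hostname, elapsed))
    return results


def format_report(results):
    """Render one line per IP, e.g. for pasting into a bug report."""
    lines = []
    for ip, hostname, elapsed in results:
        name = hostname if hostname is not None else "<unresolved>"
        lines.append(f"{ip} -> {name} ({elapsed:.3f}s)")
    return "\n".join(lines)
```

    For example, calling print(format_report(time_reverse_lookups(["10.32.160.7", "10.44.224.7", "10.40.32.9"]))) with the IPs from the warnings above shows whether each lookup fails quickly or hangs for a noticeable time. If every failed lookup takes a long time, that cost multiplied across 30 TaskManagers and 120 slots could plausibly contribute to the timeout, and the cluster DNS setup would be the next thing to check.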