
How do I handle an error from Flink 1.11.2 on YARN?

[Environment] Flink version: 1.11.2; Hadoop version: 2.6.0-cdh5.8.3; Java version: 1.8.0_144

[Command]
[jacob@localhost flink-1.11.2]$ ./bin/yarn-session.sh -jm 1024m -tm 2048m

[Symptom]
....
2020-12-08 18:06:00,134 ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli [] - Error while running the Flink session.
org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
    at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:382) ~[flink-dist_2.11-1.11.2.jar:1.11.2]
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:514) ~[flink-dist_2.11-1.11.2.jar:1.11.2]
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$4(FlinkYarnSessionCli.java:751) ~[flink-dist_2.11-1.11.2.jar:1.11.2]
    at java.security.AccessController.doPrivileged(Native Method) ~[?:?]
    at javax.security.auth.Subject.doAs(Subject.java:423) ~[?:?]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709) ~[hadoop-common-2.6.0-cdh5.8.3.jar:?]
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist_2.11-1.11.2.jar:1.11.2]
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:751) [flink-dist_2.11-1.11.2.jar:1.11.2]
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1603495749855_54023 failed 1 times due to AM Container for appattempt_1603495749855_54023_000001 exited with exitCode: 1
For more detailed output, check application tracking page:http://*******:8088/proxy/application_1603495749855_54023/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1603495749855_54023_01_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:601)
    at org.apache.hadoop.util.Shell.run(Shell.java:504)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1603495749855_54023
    at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:1021) ~[flink-dist_2.11-1.11.2.jar:1.11.2]
    at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:524) ~[flink-dist_2.11-1.11.2.jar:1.11.2]
    at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:375) ~[flink-dist_2.11-1.11.2.jar:1.11.2]
    ... 7 more


The program finished with the following exception:

org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
    at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:382)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:514)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$4(FlinkYarnSessionCli.java:751)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
    at java.base/javax.security.auth.Subject.doAs(Subject.java:423)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:751)
Caused by: org.apache.flink.yarn.YarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
Diagnostics from YARN: Application application_1603495749855_54023 failed 1 times due to AM Container for appattempt_1603495749855_54023_000001 exited with exitCode: 1
For more detailed output, check application tracking page:http://*******:8088/proxy/application_1603495749855_54023/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1603495749855_54023_01_000001
Exit code: 1
Stack trace: ExitCodeException exitCode=1:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:601)
    at org.apache.hadoop.util.Shell.run(Shell.java:504)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:786)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
If log aggregation is enabled on your cluster, use this command to further investigate the issue:
yarn logs -applicationId application_1603495749855_54023
    at org.apache.flink.yarn.YarnClusterDescriptor.startAppMaster(YarnClusterDescriptor.java:1021)
    at org.apache.flink.yarn.YarnClusterDescriptor.deployInternal(YarnClusterDescriptor.java:524)
    at org.apache.flink.yarn.YarnClusterDescriptor.deploySessionCluster(YarnClusterDescriptor.java:375)
    ... 7 more
2020-12-08 18:06:00,171 INFO org.apache.flink.yarn.YarnClusterDescriptor [] - Cancelling deployment from Deployment Failure Hook

........................

[Detailed log] Running yarn logs -applicationId application_1603495749855_54023 returns the following log:
Container: container_1603495749855_54018_01_000001 on ******.mercury.corp_8041

LogType:jobmanager.err
Log Upload Time:Tue Dec 08 17:49:33 -0800 2020
LogLength:160
Log Contents:
Unrecognized VM option 'MaxMetaspaceSize=268435456'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

LogType:jobmanager.out
Log Upload Time:Tue Dec 08 17:49:33 -0800 2020
LogLength:0
Log Contents:


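[Question] According to the log, the Java version appears to be wrong: Unrecognized VM option 'MaxMetaspaceSize=268435456'. This option only exists in Java 1.8 and later, but my Java is 1.8+, so I don't understand why the session cannot start. The same command starts successfully with the Flink 1.7.2 client.

[Note] Flink 1.7.2 is also in use at the same time, connected to Hadoop and running Flink jobs. I don't know whether that is related.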
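
The `jobmanager.err` log is written by the JVM started inside the YARN container, so the Java that rejected the option is the one on the NodeManager host, which may differ from the 1.8.0_144 on the client. A minimal sketch for checking this, meant to be run on a NodeManager host: it asks a given `java` binary whether it accepts the same option. The helper name `probe_metaspace_option` is hypothetical, and using `JAVA_HOME` to locate the binary is an assumption about how the node is set up.

```shell
#!/bin/sh
# Hypothetical helper: report whether a java binary accepts
# -XX:MaxMetaspaceSize, the option from the failing container launch.
probe_metaspace_option() {
  java_bin="$1"
  if ! command -v "$java_bin" >/dev/null 2>&1; then
    # The binary is not on PATH or is not executable.
    echo "missing"
  elif "$java_bin" -XX:MaxMetaspaceSize=268435456 -version >/dev/null 2>&1; then
    # The JVM started, so the option is recognized.
    echo "accepted"
  else
    # The JVM refused to start, as in the jobmanager.err log.
    echo "rejected"
  fi
}

# Assumption: the node's JAVA_HOME points at the JDK that YARN containers use;
# if JAVA_HOME is unset, fall back to whatever `java` is on PATH.
probe_metaspace_option "${JAVA_HOME:+$JAVA_HOME/bin/}java"
```

A result of `rejected` on a node would be consistent with the error in the log above.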

Thanks!

*From the Flink mailing list archive, compiled by volunteers

小阿怪 2021-12-06 12:17:19
1 answer
  • This issue has been fixed; it was indeed a Java version problem!
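
    The answer does not say how the Java version was corrected. One possible approach, sketched here under assumptions, is to point the YARN containers at a Java 8 JDK through Flink's `containerized.master.env.` and `containerized.taskmanager.env.` configuration prefixes, passed with the `-D` option of `yarn-session.sh`. The JDK path and the helper name `session_cmd` are hypothetical, and the snippet only prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical path: wherever a Java 8 JDK is installed on every NodeManager host.
JAVA8_HOME="${JAVA8_HOME:-/usr/java/jdk1.8.0_144}"

# Hypothetical helper: print the yarn-session.sh command line from the question
# with JAVA_HOME overridden for the JobManager (master) and TaskManager containers.
session_cmd() {
  java_home="$1"
  echo "./bin/yarn-session.sh -jm 1024m -tm 2048m" \
       "-Dcontainerized.master.env.JAVA_HOME=${java_home}" \
       "-Dcontainerized.taskmanager.env.JAVA_HOME=${java_home}"
}

# Dry run: show the command so it can be reviewed before launching it by hand.
session_cmd "$JAVA8_HOME"
```

    The same two keys could instead be set in `flink-conf.yaml`. Whether the override takes effect depends on the cluster's NodeManager configuration, so treat this as a starting point, not a confirmed fix.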

    2021-12-06 13:20:38