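A Flink job reads HDFS data fine when run locally from IDEA, but after packaging the project and submitting it to a Flink cluster it cannot read HDFS data and fails with the error below. Why?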

八戒八戒2333 2019-06-06 13:19:23
 The program finished with the following exception:

org.apache.flink.client.program.ProgramInvocationException: Job failed. (JobID: 74a2d820909fee963c4dea371b5c236c)
    at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:268)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:483)
    at org.apache.flink.streaming.api.environment.StreamContextEnvironment.execute(StreamContextEnvironment.java:66)
    at org.apache.flink.streaming.api.scala.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.scala:654)
    at org.myflink.quickstart.WordCount$.main(WordCount.scala:20)
    at org.myflink.quickstart.WordCount.main(WordCount.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
    at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
    at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:423)
    at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
    at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
    at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
    at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
    at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)
Caused by: org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
    at org.apache.flink.client.program.rest.RestClusterClient.submitJob(RestClusterClient.java:265)
    ... 19 more
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:403)
    at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:318)
    at org.apache.flink.streaming.api.functions.source.ContinuousFileMonitoringFunction.run(ContinuousFileMonitoringFunction.java:196)
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:93)
    at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:57)
    at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:97)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies.
    at org.apache.flink.core.fs.UnsupportedSchemeFactory.create(UnsupportedSchemeFactory.java:64)
    at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:399)
    ... 8 more

My local bashrc already sets:

HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop

flink-conf.yaml has also been configured as follows:

env.hadoop.conf.dir=/usr/local/hadoop/etc/hadoop

What is causing this?
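
A side check, offered as an assumption rather than a diagnosis: flink-conf.yaml entries are normally written as `key: value`, so if the line above appears in the file literally with `=`, it may not be picked up. That would not by itself explain "Hadoop is not in the classpath", but it is cheap to rule out. The sketch below flags entries that lack the `: ` separator; the function name `malformed_entries` and the sample text are made up for illustration.

```python
def malformed_entries(conf_text):
    """Return non-comment lines that lack a 'key: value' separator."""
    bad = []
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        key, sep, value = line.partition(": ")
        if not sep or not key.strip() or not value.strip():
            bad.append(raw.strip())
    return bad


# Hypothetical fragment; the second entry mirrors the line quoted in the question.
sample = """\
# flink-conf.yaml fragment
jobmanager.rpc.address: localhost
env.hadoop.conf.dir=/usr/local/hadoop/etc/hadoop
"""
print(malformed_entries(sample))
# prints ['env.hadoop.conf.dir=/usr/local/hadoop/etc/hadoop']
```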

All answers (2)
  • 不语奈何
    2019-09-03 11:26:11

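    The JAR contains a code-default.xml file (presumably Hadoop's core-default.xml). Fix that. This file is what keeps the hdfs address from being recognized, so the file cannot be read.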
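
To see whether a job JAR actually bundles such a file, a minimal sketch that lists matching entries might look like the following. The file names in `SUSPECT_NAMES` are an assumption (Hadoop's usual default config files; the answer above only mentions one file), and the function name is made up for illustration.

```python
import zipfile

# Assumed names of interest: Hadoop's bundled default config files.
SUSPECT_NAMES = ("core-default.xml", "hdfs-default.xml")


def find_bundled_hadoop_configs(jar_path):
    """List entries in the JAR whose base name is a Hadoop default config."""
    with zipfile.ZipFile(jar_path) as jar:
        return [name for name in jar.namelist()
                if name.rsplit("/", 1)[-1] in SUSPECT_NAMES]
```

Calling `find_bundled_hadoop_configs("target/my-job.jar")` (a hypothetical path) returns an empty list when the JAR carries none of these files. A non-empty result only shows the file is present, not that it is the cause.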

  • 阿森纳不可战胜
    2019-07-17 23:36:50

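    I am using flink-1.7.2 and hadoop-2.7.2, with hadoop_conf_dir and env configured as well, but reading data on HDFS failed with the same error. After I added flink-shaded-hadoop2-uber-1.7.2.jar, the package matching my Flink and Hadoop versions downloaded from the official site, to flink/lib, the error went away. I am not sure whether your problem can be solved the same way.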
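
A rough way to check this on each cluster node is sketched below: it looks for a JAR with "hadoop" in its name under flink/lib. It also treats a non-empty `HADOOP_CLASSPATH` as sufficient, which is an addition not mentioned in this thread (Flink's startup scripts append that variable to the classpath). The function names are hypothetical and the name match is only a heuristic.

```python
import os


def hadoop_jars_in_lib(flink_lib_dir):
    """Return JAR file names in flink/lib whose name mentions 'hadoop'."""
    return sorted(name for name in os.listdir(flink_lib_dir)
                  if name.endswith(".jar") and "hadoop" in name.lower())


def hadoop_visible(flink_lib_dir, environ=os.environ):
    """True if a Hadoop JAR sits in lib or HADOOP_CLASSPATH is non-empty."""
    return bool(hadoop_jars_in_lib(flink_lib_dir)
                or environ.get("HADOOP_CLASSPATH", "").strip())
```

For example, `hadoop_visible("/usr/local/flink/lib")` (a hypothetical install path) returning False would be consistent with the "Hadoop is not in the classpath/dependencies" message in the trace. Note that HADOOP_CONF_DIR alone points at configuration files, not at the Hadoop JARs.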
