
Flink installed on k8s cannot access HDFS

Flink is installed on Kubernetes, while the Hadoop cluster is a native installation. When a job is submitted to Flink, it fails because checkpoints cannot be used. The full error is:

    Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 'hdfs'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded. For a full list of supported file systems, please see https://ci.apache.org/projects/flink/flink-docs-stable/ops/filesystems/.
        at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:530) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:407) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.core.fs.Path.getFileSystem(Path.java:274) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.runtime.state.filesystem.FsCheckpointStorageAccess.<init>(FsCheckpointStorageAccess.java:64) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.runtime.state.filesystem.FsStateBackend.createCheckpointStorage(FsStateBackend.java:527) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.streaming.runtime.tasks.StreamTask.<init>(StreamTask.java:337) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.streaming.runtime.tasks.StreamTask.<init>(StreamTask.java:304) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.<init>(SourceStreamTask.java:76) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.<init>(SourceStreamTask.java:72) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_292]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_292]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_292]
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_292]
        at org.apache.flink.runtime.taskmanager.Task.loadAndInstantiateInvokable(Task.java:1515) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:727) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]
    Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies.
        at org.apache.flink.core.fs.UnsupportedSchemeFactory.create(UnsupportedSchemeFactory.java:55) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:526) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:407) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.core.fs.Path.getFileSystem(Path.java:274) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.runtime.state.filesystem.FsCheckpointStorageAccess.<init>(FsCheckpointStorageAccess.java:64) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.runtime.state.filesystem.FsStateBackend.createCheckpointStorage(FsStateBackend.java:527) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.streaming.runtime.tasks.StreamTask.<init>(StreamTask.java:337) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.streaming.runtime.tasks.StreamTask.<init>(StreamTask.java:304) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.<init>(SourceStreamTask.java:76) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.<init>(SourceStreamTask.java:72) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ~[?:1.8.0_292]
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) ~[?:1.8.0_292]
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) ~[?:1.8.0_292]
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423) ~[?:1.8.0_292]
        at org.apache.flink.runtime.taskmanager.Task.loadAndInstantiateInvokable(Task.java:1515) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:727) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:563) ~[flink-dist_2.12-1.13.0.jar:1.13.0]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]

Could someone please take a look? The main point is that there are two clusters involved: one deployed on k8s and one that is not.

Guest v4spwg6razmti · 2021-09-27 17:19:27 · 1103 views
Answer from the Alibaba Cloud AI Assistant

Based on the error you provided, the problem is that when Flink tries to use HDFS as the file system for checkpoints, it cannot find a file system implementation that supports the 'hdfs' scheme. This usually means the Flink runtime environment is missing the Hadoop dependencies it needs to access HDFS.

Given that your Flink is deployed on Kubernetes (k8s) while the Hadoop cluster is installed natively, here are several possible steps to resolve this:

  1. Make sure the Hadoop client libraries are included in the Flink image

    • When building the Flink container image, include the Hadoop client libraries so that Flink can recognize and use HDFS. This can be done in the Dockerfile with something like:
      # Using Apache Hadoop as an example
      ADD https://downloads.apache.org/hadoop/common/hadoop-3.x.x/hadoop-3.x.x.tar.gz /opt/
      RUN tar -xzf /opt/hadoop-3.x.x.tar.gz -C /opt/ && ln -s /opt/hadoop-3.x.x /opt/hadoop
      ENV HADOOP_HOME=/opt/hadoop
      ENV PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
      

      Remember to replace 3.x.x with the actual Hadoop version number and make sure that version is compatible with your Hadoop cluster. Also note what the inner exception says: "Hadoop is not in the classpath/dependencies". Installing Hadoop in the image is not enough by itself; the Hadoop jars must also end up on Flink's classpath, which Flink's startup scripts read from the HADOOP_CLASSPATH environment variable.
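
      Building on the Dockerfile above, the classpath wiring could be sketched as follows (the /opt/hadoop path comes from the example image layout and is an assumption; adjust it to your own image):

      ```shell
      # Point Flink at the Hadoop installation and its configuration.
      export HADOOP_HOME=/opt/hadoop
      export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

      # Flink discovers Hadoop through HADOOP_CLASSPATH;
      # `hadoop classpath` prints the full list of Hadoop jars.
      export HADOOP_CLASSPATH=$("$HADOOP_HOME/bin/hadoop" classpath)

      # Quick check inside the container: the HDFS client jars should be listed.
      echo "$HADOOP_CLASSPATH" | tr ':' '\n' | grep hadoop-hdfs
      ```

      These exports can be baked into the image (e.g. via ENV or an entrypoint script) so that every Flink process sees them. This sketch depends on a Hadoop installation being present, so it cannot be run standalone.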

  2. Configure Flink to use the Hadoop configuration

    • In Flink's configuration (e.g. flink-conf.yaml) and environment, make sure the Hadoop-related paths are set correctly, in particular the Hadoop configuration directory (HADOOP_CONF_DIR). If these files are managed outside the container, you may need to mount the Hadoop configuration files (such as core-site.xml and hdfs-site.xml) into the Flink container as a volume.
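
      On Kubernetes, the volume mount described above is often done with a ConfigMap. A sketch (the name hadoop-config and the local directory ./hadoop-conf are placeholders, not from the original post):

      ```shell
      # Create a ConfigMap from the Hadoop client configs copied off the Hadoop cluster.
      kubectl create configmap hadoop-config \
        --from-file=core-site.xml=./hadoop-conf/core-site.xml \
        --from-file=hdfs-site.xml=./hadoop-conf/hdfs-site.xml
      ```

      The Flink pod template would then mount this ConfigMap (for example at /etc/hadoop/conf) and set HADOOP_CONF_DIR to the mount path, so that Flink reads the HDFS cluster addresses from core-site.xml.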
  3. Check network connectivity

    • Make sure the Kubernetes cluster running Flink can reach the HDFS service of the Hadoop cluster. This covers network policies, firewall rules, and DNS resolution.
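
      A quick way to test this is to run the checks below from inside a Flink pod (namenode-host and port 8020 are assumptions; 8020 is a common default NameNode RPC port, but yours should match fs.defaultFS in core-site.xml):

      ```shell
      # DNS: can the pod resolve the NameNode hostname?
      getent hosts namenode-host

      # TCP: is the NameNode RPC port reachable from the pod?
      nc -vz namenode-host 8020
      ```

      Keep in mind that HDFS clients also talk directly to DataNodes, so the DataNode hosts and ports must be reachable from the pods as well, not just the NameNode.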
  4. Use a Hadoop plugin or sidecar container

    • Another approach is to run a Hadoop-client sidecar container in the Flink task pods, or to use Kubernetes init containers to prepare the Hadoop environment, so that all required configuration and dependencies are in place before Flink starts.
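
      The init-container variant mentioned above might look like the following pod-spec fragment (all image names, volume names, and paths here are hypothetical):

      ```yaml
      # Sketch: an init container copies the Hadoop client into a shared volume,
      # which the Flink container mounts and exposes via HADOOP_HOME.
      initContainers:
        - name: fetch-hadoop
          image: my-registry/hadoop-client:3.x   # hypothetical image
          command: ["sh", "-c", "cp -r /opt/hadoop /shared/"]
          volumeMounts:
            - name: hadoop-libs
              mountPath: /shared
      containers:
        - name: taskmanager
          env:
            - name: HADOOP_HOME
              value: /shared/hadoop
          volumeMounts:
            - name: hadoop-libs
              mountPath: /shared
      volumes:
        - name: hadoop-libs
          emptyDir: {}
      ```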
  5. Consider a more integrated solution

    • Alibaba Cloud's E-MapReduce service bundles Hadoop, Spark, Hive, and other big-data components and integrates with the Container Service for Kubernetes (ACK). If you are open to redeploying or migrating, it can be a long-term way to simplify management and avoid this class of problem.

Work through the suggestions above step by step to narrow down and fix the problem; hopefully they help you get Flink checkpointing to HDFS successfully.

AI-assistant answers may contain inaccuracies and are for reference only.