HDFS: An Error When Appending a File to the End of Another with a Custom Function - 阿里云开发者社区 (Alibaba Cloud Developer Community)


2023-01-02


1. Experimental environment:

Ubuntu 16.04
Hadoop 2.7.1, pseudo-distributed (`only one DataNode`)
Eclipse 3.8

2. Solution

Java code:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import java.io.*;

public class HDFSApi {
    /** Check whether a path exists in HDFS. */
    public static boolean test(Configuration conf, String path) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        return fs.exists(new Path(path));
    }

    /** Copy a local file to the given HDFS path; an existing file is overwritten. */
    public static void copyFromLocalFile(Configuration conf, String localFilePath, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path localPath = new Path(localFilePath);
        Path remotePath = new Path(remoteFilePath);
        /* First argument: whether to delete the source file; second argument: whether to overwrite */
        fs.copyFromLocalFile(false, true, localPath, remotePath);
        fs.close();
    }

    /** Append the content of a local file to the end of an HDFS file. */
    public static void appendToFile(Configuration conf, String localFilePath, String remoteFilePath) throws IOException {
        FileSystem fs = FileSystem.get(conf);
        Path remotePath = new Path(remoteFilePath);
        /* Input stream that reads the local file */
        FileInputStream in = new FileInputStream(localFilePath);
        /* Output stream whose content is appended to the end of the HDFS file */
        FSDataOutputStream out = fs.append(remotePath);
        /* Copy the content in 1 KB chunks */
        byte[] data = new byte[1024];
        int read = -1;
        while ((read = in.read(data)) > 0) {
            out.write(data, 0, read);
        }
        out.close();
        in.close();
        fs.close();
    }

    /** Entry point. */
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        // fs.default.name is the deprecated alias of fs.defaultFS
        conf.set("fs.default.name", "hdfs://localhost:9000");
        String localFilePath = "/home/hadoop/text.txt";    // local path
        String remoteFilePath = "/user/hadoop/text.txt";   // HDFS path
        String choice = "append";       // if the file exists, append to its end
        // String choice = "overwrite"; // if the file exists, overwrite it

        try {
            /* Check whether the file exists */
            boolean fileExists = false;
            if (HDFSApi.test(conf, remoteFilePath)) {
                fileExists = true;
                System.out.println(remoteFilePath + " already exists.");
            } else {
                System.out.println(remoteFilePath + " does not exist.");
            }
            /* Handle the file */
            if (!fileExists) { // the file does not exist, so upload it
                HDFSApi.copyFromLocalFile(conf, localFilePath, remoteFilePath);
                System.out.println(localFilePath + " uploaded to " + remoteFilePath);
            } else if (choice.equals("overwrite")) { // overwrite
                HDFSApi.copyFromLocalFile(conf, localFilePath, remoteFilePath);
                System.out.println(localFilePath + " overwrote " + remoteFilePath);
            } else if (choice.equals("append")) { // append
                HDFSApi.appendToFile(conf, localFilePath, remoteFilePath);
                System.out.println(localFilePath + " appended to " + remoteFilePath);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
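The read/write loop used for appending can be tried without a cluster. The following is only a minimal sketch under that assumption: it appends one local file to another, with `FileOutputStream` in append mode standing in for the HDFS append stream. The class name `LocalAppendDemo` and the helper `appendLocal` are hypothetical and are not part of the original code.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class LocalAppendDemo {
    /** Append the bytes of the local file `source` to the end of the local file `target`. */
    static void appendLocal(String source, String target) throws IOException {
        // try-with-resources closes both streams even if the copy fails
        try (InputStream in = new FileInputStream(source);
             OutputStream out = new FileOutputStream(target, true)) { // true = append mode
            byte[] data = new byte[1024];
            int read;
            while ((read = in.read(data)) > 0) {
                out.write(data, 0, read);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary files stand in for the local and remote files of the article
        Path source = Files.createTempFile("source", ".txt");
        Path target = Files.createTempFile("target", ".txt");
        Files.write(source, "world\n".getBytes(StandardCharsets.UTF_8));
        Files.write(target, "hello\n".getBytes(StandardCharsets.UTF_8));

        appendLocal(source.toString(), target.toString());

        // Print the target file after the append
        System.out.print(new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
        Files.delete(source);
        Files.delete(target);
    }
}
```

The try-with-resources form is a design choice of this sketch: it closes the streams even when the copy throws, which the explicit `close()` calls in the HDFS version do not guarantee.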

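Error message: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try.

At first glance, this means that a DataNode in the write pipeline was judged to be bad during the transfer, and a new, healthy DataNode is needed to keep the file moving through the pipeline.

Official documentation: the `hdfs-default.xml` configuration file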

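If a datanode or network failure occurs in the write pipeline, DFSClient tries to remove the failed datanode from the pipeline and then continues writing with the remaining datanodes. As a result, the number of datanodes in the pipeline decreases. The feature in question adds new datanodes to the pipeline. `dfs.client.block.write.replace-datanode-on-failure.enable` is the site-wide property that enables or disables this feature, and `dfs.client.block.write.replace-datanode-on-failure.policy` selects the replacement policy. When the cluster is very small (for example, `3 nodes or fewer`), cluster administrators may want to set the policy to `NEVER` in the default configuration file, or disable the feature. Otherwise, users may see an unusually high rate of pipeline failures, because no new datanode can be found for replacement.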

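In addition, `dfs.client.block.write.replace-datanode-on-failure.policy` is used only when `dfs.client.block.write.replace-datanode-on-failure.enable` is `true`. ALWAYS: always add a new datanode when an existing datanode is removed. NEVER: never add a new datanode. DEFAULT: let r be the replication factor and n the number of existing datanodes; a new datanode is added only if r is greater than or equal to 3 and either (1) floor(r/2) is greater than or equal to n, or (2) r is greater than n and the block is hflushed/appended.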
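The DEFAULT rule can be checked with a few lines of arithmetic. The sketch below is a literal transcription of the rule as quoted above, not Hadoop's own implementation, and it only considers n of at least 1. The class name `ReplaceDatanodePolicyDemo` and the method `shouldAddDatanode` are hypothetical. The value r = 3 is an assumption (it is the default of `dfs.replication`; the article does not state the value actually used), while n = 1 matches the single DataNode of the pseudo-distributed setup.

```java
public class ReplaceDatanodePolicyDemo {
    /**
     * Literal transcription of the DEFAULT policy text:
     * add a new datanode only if r >= 3 and either
     * (1) floor(r/2) >= n, or (2) r > n and the block is hflushed/appended.
     *
     * @param r replication factor
     * @param n number of existing datanodes in the pipeline (assumed >= 1 here)
     * @param hflushedOrAppended whether the block is hflushed or appended
     */
    static boolean shouldAddDatanode(int r, int n, boolean hflushedOrAppended) {
        if (r < 3) {
            return false;
        }
        // Integer division of non-negative ints equals floor(r/2)
        return (r / 2 >= n) || (r > n && hflushedOrAppended);
    }

    public static void main(String[] args) {
        // Assumed case: replication 3, one DataNode, append in progress
        System.out.println("r=3, n=1, append:    " + shouldAddDatanode(3, 1, true));
        // Replication below 3: the rule never asks for a replacement
        System.out.println("r=1, n=1, append:    " + shouldAddDatanode(1, 1, true));
        // Two of three datanodes left, plain write without hflush/append
        System.out.println("r=3, n=2, no append: " + shouldAddDatanode(3, 2, false));
    }
}
```

Under these assumed values, the first case asks for a replacement datanode, which a cluster with a single DataNode cannot provide; that would be consistent with the error message reported above.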

Method 1: add the following two lines to the `main` function of the Java code:

conf.set("dfs.client.block.write.replace-datanode-on-failure.policy","NEVER"); 
conf.set("dfs.client.block.write.replace-datanode-on-failure.enable","true");

Method 2: add the following to hdfs-site.xml:

<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>NEVER</value>
</property>

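3. Notes

Generally, enabling this feature is not recommended when the cluster has 3 or fewer DataNodes. (This machine runs in pseudo-distributed mode with a single DataNode; for convenience during testing, the feature is simply left enabled here.)

The end!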
