
Error when reading an .lzo_deflate file on HDFS with Java

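The Linux environment is fine, and so are the Hadoop installation and configuration; `hadoop fs -text` opens this compressed file without any problem. Reading it from Java, however, throws an error. Could someone please take a look? Thanks.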

The code is as follows:
public static void main(String[] args) {
    String uri = "/daas/****/MBLDPI3G.2016081823_10.1471532401822.lzo_deflate";
    Configuration conf = new Configuration();
    String path = "/software/servers/hadoop-2.6.3-bin/hadoop-2.6.3/etc/hadoop/";
    conf.addResource(new Path(path + "core-site.xml"));
    conf.addResource(new Path(path + "hdfs-site.xml"));
    conf.addResource(new Path(path + "mapred-site.xml"));
    try {
        CompressionCodecFactory factory = new CompressionCodecFactory(conf);

        CompressionCodec codec = factory.getCodec(new Path(uri));
        if (codec == null) {
            System.out.println("Codec for " + uri + " not found.");
        } else {
            CompressionInputStream in = null;
            try {
                in = codec.createInputStream(new java.io.FileInputStream(uri));
                byte[] buffer = new byte[100];
                int len = in.read(buffer);
                while (len > 0) {
                    System.out.write(buffer, 0, len);
                    len = in.read(buffer);
                }
            } finally {
                if (in != null) {
                    in.close();
                }
            }
        }

    } catch (Exception e) {
        e.printStackTrace();
    }
}

The error output is as follows:
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.NativeCodeLoader).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
java.io.FileNotFoundException: /daas/***/MBLDPI3G.2016081823_10.1471532401822.lzo_deflate (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at java.io.FileInputStream.<init>(FileInputStream.java:101)
    at FileDecompressor.main(FileDecompressor.java:53)

Jars on the classpath:
<classpathentry kind="lib" path="lib/commons-cli-1.2.jar"/>
<classpathentry kind="lib" path="lib/commons-collections-3.2.2.jar"/>
<classpathentry kind="lib" path="lib/commons-configuration-1.6.jar"/>
<classpathentry kind="lib" path="lib/commons-lang-2.6.jar"/>
<classpathentry kind="lib" path="lib/commons-logging-1.1.3.jar"/>
<classpathentry kind="lib" path="lib/guava-18.0.jar"/>
<classpathentry kind="lib" path="lib/hadoop-auth-2.6.3.jar"/>
<classpathentry kind="lib" path="lib/hadoop-common-2.6.3.jar"/>
<classpathentry kind="lib" path="lib/hadoop-hdfs-2.6.3.jar"/>
<classpathentry kind="lib" path="lib/htrace-core-3.0.4.jar"/>
<classpathentry kind="lib" path="lib/log4j-1.2.17.jar"/>
<classpathentry kind="lib" path="lib/protobuf-java-2.5.0.jar"/>
<classpathentry kind="lib" path="lib/slf4j-api-1.7.5.jar"/>
<classpathentry kind="lib" path="lib/slf4j-log4j12-1.7.5.jar"/>
<classpathentry kind="lib" path="lib/hadoop-lzo-0.4.20.jar"/>

爱吃鱼的程序员 2020-06-09 10:40:03
1 answer
  • https://developer.aliyun.com/profile/5yerqm5bn5yqg?spm=a2c6h.12873639.0.0.6eae304abcjaIB

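    The file you are reading lives on HDFS, so you have to give an HDFS path. If you open that path directly, it is resolved against the local filesystem, where the file either does not exist or is not recognized.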

    Try this:

    String uri = "hdfs://localhost:9000/daas/****/MBLDPI3G.2016081823_10.1471532401822.lzo_deflate";
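    A scheme-qualified HDFS URI needs two slashes before the host and port (`hdfs://host:port/path`). As a small illustration, the sketch below uses only `java.net.URI` to show how the two-slash form, a single-slash form, and a bare path are parsed. The class name `HdfsUriCheck`, the address `localhost:9000`, and the path `/data/sample.lzo_deflate` are placeholders for this example, not values taken from the cluster in the question.

```java
import java.net.URI;

// Illustration only: shows how java.net.URI splits three spellings of a path.
final class HdfsUriCheck {

    /** Returns the "host:port" authority, or null when the URI has none. */
    static String authorityOf(String s) {
        return URI.create(s).getAuthority();
    }

    /** Returns the path component of the URI. */
    static String pathOf(String s) {
        return URI.create(s).getPath();
    }

    public static void main(String[] args) {
        String[] candidates = {
            "hdfs://localhost:9000/data/sample.lzo_deflate", // scheme + authority + path
            "hdfs:/localhost:9000/data/sample.lzo_deflate",  // one slash: no authority at all
            "/data/sample.lzo_deflate"                       // bare path: no scheme either
        };
        for (String s : candidates) {
            URI u = URI.create(s);
            System.out.println(s);
            System.out.println("  scheme=" + u.getScheme()
                    + " authority=" + u.getAuthority()
                    + " path=" + u.getPath());
        }
    }
}
```

    With a single slash, `localhost:9000` is parsed as the first path segment rather than as the NameNode address, so the URI does not point at the intended host. A bare path has no scheme, and Hadoop's `FileSystem` resolves it against `fs.defaultFS` from the loaded configuration.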

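    Thanks, brother! But the file path itself is fine: running `hadoop fs -text` on the command line shows the file.

    Read my answer carefully; it does differ from your path.

    Hi, you are probably using that telecom cloud company's server too, right? Judging from your file and path, I am guessing we are working with the same data. I ran into this problem as well: I cannot read this lzo_deflate file with either Spark or Hadoop (Java), and I have already spent two days on it. Do you have a solution? Also, lzo_deflate does not support split reads, which means that even if I manage to read an lzo_deflate file, I can only use one map task per lzo_deflate file. Has your problem been solved? If so, I would be very grateful if you could message me privately or send it to my email, 1171856576@qq.com, so we can discuss it. Thanks!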
    2020-06-09 10:40:18
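
    The stack trace in the question originates in `java.io.FileInputStream`, which only reads the local filesystem. That explains why `hadoop fs -text` can see the file while the Java program cannot: the CLI resolves the path against HDFS. A minimal sketch of one possible fix is to open the stream through Hadoop's `FileSystem` API instead. It assumes that `core-site.xml` sets `fs.defaultFS` to the cluster's NameNode, that `io.compression.codecs` registers the LZO codec from hadoop-lzo, and that the native LZO libraries are available to the JVM; none of this can be confirmed from the thread. The class name `HdfsFileDecompressor` is made up for the example.

```java
import java.io.InputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

// Hypothetical class name; paths are copied from the question as posted.
public class HdfsFileDecompressor {
    public static void main(String[] args) throws Exception {
        String uri = "/daas/****/MBLDPI3G.2016081823_10.1471532401822.lzo_deflate";
        String confDir = "/software/servers/hadoop-2.6.3-bin/hadoop-2.6.3/etc/hadoop/";

        Configuration conf = new Configuration();
        conf.addResource(new Path(confDir + "core-site.xml"));
        conf.addResource(new Path(confDir + "hdfs-site.xml"));

        // A URI without a scheme falls back to fs.defaultFS from the configuration
        // (assumed here to point at HDFS).
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        Path path = new Path(uri);

        // The codec is chosen from the file extension (.lzo_deflate).
        CompressionCodec codec = new CompressionCodecFactory(conf).getCodec(path);
        if (codec == null) {
            System.err.println("Codec for " + uri + " not found.");
            return;
        }

        InputStream in = null;
        try {
            // fs.open reads from HDFS, unlike java.io.FileInputStream.
            in = codec.createInputStream(fs.open(path));
            IOUtils.copyBytes(in, System.out, 4096, false);
        } finally {
            IOUtils.closeStream(in);
        }
    }
}
```

    If the codec lookup returns null, the LZO codec is probably not registered in the configuration that was loaded; if the codec is found but fails to initialize, the native library path is a likely place to check.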