在使用E-Mapreduce产品时,出现如下问题:
java.lang.IllegalArgumentException: Wrong FS: oss://id:key@testemr.oss-cn-hangzhou-internal.aliyuncs.com/output5, expected: hdfs://ip:9000
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:645)
at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:193)
at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:105)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1118)
at org.apache.hadoop.hdfs.DistributedFileSystem$18.doCall(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1114)
at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1400)
at com.aliyun.emr.example.WordCount.main(WordCount.java:129)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
代码为,出错的行为: if (fs.exists(outputPath)) {
List<String> otherArgs = new ArrayList<String>();
for (int i = 0; i < remainingArgs.length; ++i) {
if ("-skip".equals(remainingArgs[i])) {
job.addCacheFile(new Path(EMapReduceOSSUtil.buildOSSCompleteUri(remainingArgs[++i], conf)).toUri());
job.getConfiguration().setBoolean("wordcount.skip.patterns", true);
} else {
otherArgs.add(remainingArgs[i]);
}
}
Path outputPath = new Path(EMapReduceOSSUtil.buildOSSCompleteUri(otherArgs.get(1), conf));
org.apache.hadoop.fs.FileSystem fs = org.apache.hadoop.fs.FileSystem.get(conf);
if (fs.exists(outputPath)) {
fs.delete(outputPath, true);
}
FileInputFormat.addInputPath(job, new Path(EMapReduceOSSUtil.buildOSSCompleteUri(otherArgs.get(0), conf)));
FileOutputFormat.setOutputPath(job, new Path(EMapReduceOSSUtil.buildOSSCompleteUri(otherArgs.get(1), conf)));
System.exit(job.waitForCompletion(true) ? 0 : 1);
这个一般就是:访问 oss://id:key@testemr.oss-cn-hangzhou-internal.aliyuncs.com/output5 但是拿到了DistributedFileSystem
看具体的代码:
Path outputPath = new Path(EMapReduceOSSUtil.buildOSSCompleteUri(otherArgs.get(1), conf));
org.apache.hadoop.fs.FileSystem fs = org.apache.hadoop.fs.FileSystem.get(conf);
if (fs.exists(outputPath)) {
fs.delete(outputPath, true);
}
因为 outputPath 为: oss://id:key@testemr.oss-cn-hangzhou-internal.aliyuncs.com/output5 实际 fs为:DistributedFileSystem 就会出现如上的问题。
解决的办法应该写成下面的,用具体的路径拿到FileSystem
FileSystem.get(conf); 改为 FileSystem.get(outputPath.toUri(), conf);
Path outputPath = new Path(EMapReduceOSSUtil.buildOSSCompleteUri(otherArgs.get(1), conf));
org.apache.hadoop.fs.FileSystem fs = org.apache.hadoop.fs.FileSystem.get(outputPath.toUri(), conf);
if (fs.exists(outputPath)) {
fs.delete(outputPath, true);
}
以前一般只有一个HDFS;现在除了有HDFS,还有OSS,还请大家注意。
版权声明:本文内容由阿里云实名注册用户自发贡献,版权归原作者所有,阿里云开发者社区不拥有其著作权,亦不承担相应法律责任。具体规则请查看《阿里云开发者社区用户服务协议》和《阿里云开发者社区知识产权保护指引》。如果您发现本社区中有涉嫌抄袭的内容,填写侵权投诉表单进行举报,一经查实,本社区将立刻删除涉嫌侵权内容。