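While learning how to read data from Hadoop today, I wrote a simple method, but it failed with an error when I tested it.
Here is the code that reads a file from Hadoop and writes it to the local disk: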
```java
package hdfs;

import java.io.BufferedReader;
import java.io.FileWriter;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;

import org.apache.hadoop.io.IOUtils;

public class HDFS {

    public static void main(String[] args) throws Exception {
        InputStream inputStream = null;
        FileWriter writer = null;
        try {
            // No handler for the hdfs scheme has been registered at this point.
            URL url = new URL("hdfs://localhost:9000/input.txt");
            inputStream = url.openStream();
            writer = new FileWriter("/home/wxl/桌面/tmp.txt");
            InputStreamReader reader = new InputStreamReader(inputStream);
            BufferedReader bufferedReader = new BufferedReader(reader);
            String line = null;
            while ((line = bufferedReader.readLine()) != null) {
                writer.write(line);
                // readLine() strips the line terminator, so write one back.
                writer.write(System.lineSeparator());
            }
        } finally {
            IOUtils.closeStream(inputStream);
            if (writer != null) {
                writer.close();
            }
        }
    }
}
```
After some digging, I found the explanation in Hadoop: The Definitive Guide:
"There’s a little bit more work required to make Java recognize Hadoop’s hdfs URL scheme. This is achieved by calling the setURLStreamHandlerFactory method on URL with an instance of FsUrlStreamHandlerFactory. This method can be called only once per JVM, so it is typically executed in a static block."
In other words, Java does not recognize the hdfs URL scheme out of the box. You have to call URL.setURLStreamHandlerFactory with an FsUrlStreamHandlerFactory instance, and because that method can be called only once per JVM, the call usually goes in a static block.
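To make the "only once per JVM" restriction concrete, here is a small sketch that uses only the JDK, so it runs without Hadoop on the classpath. The `demo` scheme, `DemoFactory`, and `UrlHandlerDemo` are names invented for this illustration; `DemoFactory` merely stands in for the role that FsUrlStreamHandlerFactory plays for `hdfs`.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;
import java.nio.charset.StandardCharsets;

final class UrlHandlerDemo {

    /** Hypothetical factory that only knows the made-up "demo" scheme. */
    static final class DemoFactory implements URLStreamHandlerFactory {
        @Override
        public URLStreamHandler createURLStreamHandler(String protocol) {
            if (!"demo".equals(protocol)) {
                return null; // let the JDK fall back to its built-in handlers
            }
            return new URLStreamHandler() {
                @Override
                protected URLConnection openConnection(URL url) {
                    return new URLConnection(url) {
                        @Override
                        public void connect() {
                            // Nothing to connect to in this stand-in.
                        }

                        @Override
                        public InputStream getInputStream() {
                            return new ByteArrayInputStream(
                                    "hello".getBytes(StandardCharsets.UTF_8));
                        }
                    };
                }
            };
        }
    }

    /** Returns true if java.net.URL accepts the given URL string. */
    static boolean canParse(String spec) {
        try {
            new URL(spec);
            return true;
        } catch (MalformedURLException e) {
            return false; // e.g. "unknown protocol"
        }
    }

    /** Tries to install the factory; returns false if one is already set. */
    static boolean register() {
        try {
            URL.setURLStreamHandlerFactory(new DemoFactory());
            return true;
        } catch (Error e) {
            // The JDK throws java.lang.Error when a factory is already defined.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canParse("demo://host/file")); // false: scheme unknown
        System.out.println(register());                   // true: first call succeeds
        System.out.println(canParse("demo://host/file")); // true: scheme now known
        System.out.println(register());                   // false: second call rejected
    }
}
```

The second `register()` call fails because the JVM keeps a single, process-wide factory, which is why the call is normally placed where it runs exactly once.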
So I added a static initializer block to the class:
```java
static {
    // This method can be called at most once in a given JVM.
    URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
}
```
The code then becomes:
```java
package hdfs;

import java.io.BufferedReader;
import java.io.FileWriter;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URL;

import org.apache.hadoop.fs.FsUrlStreamHandlerFactory;
import org.apache.hadoop.io.IOUtils;

public class HDFS {

    static {
        // This method can be called at most once in a given JVM.
        URL.setURLStreamHandlerFactory(new FsUrlStreamHandlerFactory());
    }

    public static void main(String[] args) throws Exception {
        InputStream inputStream = null;
        FileWriter writer = null;
        try {
            // The hdfs scheme is now resolved by the factory registered above.
            URL url = new URL("hdfs://localhost:9000/input.txt");
            inputStream = url.openStream();
            writer = new FileWriter("/home/wxl/桌面/tmp.txt");
            InputStreamReader reader = new InputStreamReader(inputStream);
            BufferedReader bufferedReader = new BufferedReader(reader);
            String line = null;
            while ((line = bufferedReader.readLine()) != null) {
                writer.write(line);
                // readLine() strips the line terminator, so write one back.
                writer.write(System.lineSeparator());
            }
        } finally {
            IOUtils.closeStream(inputStream);
            if (writer != null) {
                writer.close();
            }
        }
    }
}
```
With that, the error is gone and the program runs successfully.
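As a side note, the read-and-write loop could also be pulled into a small helper that uses try-with-resources instead of a manual finally block. The sketch below is JDK-only and is not the code from the post: `LineCopy` and `copyLines` are hypothetical names, and it assumes the file is UTF-8 text. Because `readLine()` drops the line terminator, the helper writes a newline after each line.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.StringWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

final class LineCopy {

    /**
     * Copies text from in to out one line at a time and returns the number
     * of lines copied. The reader (and with it the input stream) is closed
     * on exit; the writer is left open for the caller to close.
     */
    static int copyLines(InputStream in, Writer out) throws IOException {
        int count = 0;
        // Assumption: the input is UTF-8 encoded text.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                out.write(line);
                out.write('\n'); // readLine() removed the terminator
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for url.openStream() here.
        InputStream in = new ByteArrayInputStream(
                "a\nb\n".getBytes(StandardCharsets.UTF_8));
        StringWriter out = new StringWriter();
        System.out.println(copyLines(in, out)); // 2
        System.out.print(out);
    }
}
```

In the program above, the same helper could be fed `url.openStream()` and a `FileWriter` in place of the in-memory stand-ins.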