1. Exception information
2018-04-22 07:50:23,841 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for Block pool <registering> (Datanode Uuid unassigned) service to master/192.168.175.20:9000. Exiting.
java.io.IOException: All specified directories are failed to load.
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:478)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1338)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1304)
        at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:314)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:226)
        at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:867)
        at java.lang.Thread.run(Thread.java:745)
Log file location: /usr/local/src/hadoop-2.6.1/logs
2. Solution
Upon inspection, $HADOOP_HOME/dfs/data/ on the master node was empty, while on slave1 and slave2 $HADOOP_HOME/dfs/data/ still contained data. The conclusion is that the initialization data on the master node and on the slave nodes was inconsistent, which is what caused the DataNode on the slaves to fail to start.
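In this situation the usual culprit is a mismatch between the clusterID recorded on the NameNode and the one left behind in each DataNode's data directory. A quick way to compare the two sides is sketched below; it assumes the NameNode metadata lives under $HADOOP_HOME/dfs/name and the DataNode data under $HADOOP_HOME/dfs/data (if dfs.namenode.name.dir / dfs.datanode.data.dir point elsewhere in hdfs-site.xml, adjust the paths accordingly).

    # On the master (NameNode side)
    grep clusterID $HADOOP_HOME/dfs/name/current/VERSION

    # On slave1 / slave2 (DataNode side)
    grep clusterID $HADOOP_HOME/dfs/data/current/VERSION

If the two clusterID values differ, the DataNode refuses to load its directories and logs the error shown above.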
After some searching, there are two possible solutions:
Method 1: go into dfs/data/ and edit the VERSION file so that its contents match the VERSION on the NameNode (i.e. the master) side. (Since the directory on my master node was empty, this method did not apply in my case; a sample VERSION file is sketched after method 2.)
Method 2: simply delete dfs/data, reformat the cluster and restart it (./hadoop namenode -format); see the command sketch at the end of this note.
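For method 1, the field that matters is clusterID in the DataNode's VERSION file. A typical Hadoop 2.x DataNode VERSION file looks roughly like the sketch below (all IDs and the layoutVersion value are placeholders, not values from this cluster); editing clusterID so that it equals the clusterID in the NameNode's VERSION is what allows the DataNode to start again.

    # $HADOOP_HOME/dfs/data/current/VERSION on a DataNode (placeholder values)
    storageID=DS-xxxxxxxx
    # the clusterID below must equal the clusterID in the NameNode's VERSION
    clusterID=CID-xxxxxxxx
    cTime=0
    datanodeUuid=xxxxxxxx
    storageType=DATA_NODE
    layoutVersion=-56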
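Method 2, written out as a command sequence (a sketch only: it assumes $HADOOP_HOME/dfs/data is the DataNode data directory on every node, and note that reformatting the NameNode throws away anything already stored in HDFS, so it is only suitable for a fresh or disposable cluster):

    # On the master: stop HDFS first
    $HADOOP_HOME/sbin/stop-dfs.sh

    # On master, slave1 and slave2: remove the stale DataNode data
    rm -rf $HADOOP_HOME/dfs/data

    # On the master: reformat the NameNode (same command as above)
    cd $HADOOP_HOME/bin && ./hadoop namenode -format

    # On the master: start HDFS again
    $HADOOP_HOME/sbin/start-dfs.sh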