Installing Hadoop
Prepare the machines: one master and several slaves. Configure /etc/hosts on each machine so that all of them can reach one another by hostname, for example:
172.16.200.4 node1 (master)
172.16.200.5 node2 (slave1)
172.16.200.6 node3 (slave2)
Host information:
Hostname | IP address | Role |
node1 (master) | 172.16.200.4 | NameNode, ResourceManager |
node2 (slave1) | 172.16.200.5 | DataNode, NodeManager |
node3 (slave2) | 172.16.200.6 | DataNode, NodeManager |
1. Set the hostname
(do this on all three machines)
Node1 is used as the example; configure the other two machines the same way.
vim /etc/hosts
172.16.200.4 node1
172.16.200.5 node2
172.16.200.6 node3
vim /etc/sysconfig/network
HOSTNAME=node1
Log in again for the change to take effect, or apply it immediately with: hostname node1
ping <hostname>    # verify
2. Add a hadoop user and grant it root (sudo) privileges
(do this on all three machines)
useradd hadoop
passwd hadoop    # the password is set the same as the username here
Edit the /etc/sudoers file, find the root line below, and add a matching line for hadoop beneath it:
root ALL=(ALL) ALL
hadoop ALL=(ALL) ALL
3. Set up passwordless SSH login
(do this on all three machines)
Once Hadoop is running, the NameNode uses SSH (Secure Shell) to start and stop the daemons on each DataNode. This requires running commands between nodes without typing a password, so we configure SSH to use passwordless public-key authentication.
Taking the three machines in this article as the example: node1 is the master node and needs to connect to node2 and node3. Make sure ssh is installed on every machine and that the sshd service is running on the DataNode machines.
(Note: [hadoop@node1 ~]$ ssh-keygen -t rsa
This command generates a key pair for the hadoop user. Press Enter to accept the default path when asked where to save the key, and press Enter again when prompted for a passphrase, i.e. leave the passphrase empty. The key pair id_rsa / id_rsa.pub is stored under /home/hadoop/.ssh by default. Then copy the contents of id_rsa.pub into the /home/hadoop/.ssh/authorized_keys file on every machine, including this one. If authorized_keys already exists on a machine, append the contents of id_rsa.pub to the end of the file; if it does not exist, simply copy id_rsa.pub over as authorized_keys.)
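The note above can be sketched as a few commands run as the hadoop user (a minimal sketch: it generates the key pair with an empty passphrase and self-authorizes this machine; on the other nodes you would copy id_rsa.pub over first, e.g. with scp, and append it the same way):

```shell
# Create ~/.ssh with the permissions sshd insists on.
mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
# Generate the key pair at the default path with an empty passphrase
# (skipped if a key already exists).
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
# Append the public key to authorized_keys; repeat this append with the
# same id_rsa.pub on node2 and node3.
cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
chmod 600 "$HOME/.ssh/authorized_keys"
```

After this, `ssh node2` from node1 should log in without a password prompt.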
4. Install the JDK (on all three machines)
Add the following to /etc/profile:
export JAVA_HOME=/usr/java/jdk1.7.0_67
export JRE_HOME=/usr/java/jdk1.7.0_67/jre
export PATH=$PATH:$JAVA_HOME/bin
source /etc/profile
5. Install Hadoop
Download the hadoop-2.6.4.tar.gz archive, then:
1. Unpack it: tar -xzvf hadoop-2.6.4.tar.gz
[hadoop@node1 hadoop-2.6.4]$ ls
bin data etc include lib libexec LICENSE.txt logs name NOTICE.txt README.txt sbin share var
[hadoop@node1 hadoop-2.6.4]$ pwd
/home/hadoop/hadoop-2.6.4
2. Create the local storage directories (they are referenced later in core-site.xml and hdfs-site.xml):
/home/hadoop/hadoop-2.6.4/var
/home/hadoop/hadoop-2.6.4/data
/home/hadoop/hadoop-2.6.4/name
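The three directories above can be created in one go (a sketch assuming Hadoop was unpacked to /home/hadoop/hadoop-2.6.4 as shown earlier; mkdir -p is harmless if a directory already exists):

```shell
# Create the local storage directories that core-site.xml and
# hdfs-site.xml point at (hadoop.tmp.dir, NameNode dir, DataNode dir).
HADOOP_HOME="${HADOOP_HOME:-$HOME/hadoop-2.6.4}"
mkdir -p "$HADOOP_HOME/var" "$HADOOP_HOME/name" "$HADOOP_HOME/data"
# Confirm all three exist.
ls -d "$HADOOP_HOME/var" "$HADOOP_HOME/name" "$HADOOP_HOME/data"
```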
3. Edit the configuration files. Seven files are involved, all under /home/hadoop/hadoop-2.6.4/etc/hadoop: hadoop-env.sh, yarn-env.sh, slaves, core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml.
4. Go into the hadoop configuration directory.
4.1. Configure hadoop-env.sh: set JAVA_HOME
4.2. Configure yarn-env.sh: set JAVA_HOME
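Both JAVA_HOME edits can be scripted with sed (a sketch; the JDK path is the one from step four, and the stub-creation line exists only so the sketch runs standalone, since on a real install both files already ship with Hadoop):

```shell
# Point hadoop-env.sh and yarn-env.sh at the JDK from step four.
CONF="${HADOOP_CONF_DIR:-$HOME/hadoop-2.6.4/etc/hadoop}"
mkdir -p "$CONF"   # no-op on a real install
for f in hadoop-env.sh yarn-env.sh; do
  # Stub only so the sketch runs on its own; the shipped files already
  # contain an "export JAVA_HOME=..." line to replace.
  [ -f "$CONF/$f" ] || echo 'export JAVA_HOME=${JAVA_HOME}' > "$CONF/$f"
  sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/java/jdk1.7.0_67|' "$CONF/$f"
done
# Show the resulting setting.
grep '^export JAVA_HOME=' "$CONF/hadoop-env.sh"
```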
4.3. Configure the slaves file: add the slave nodes
node1
node2
node3
4.4. Configure core-site.xml: add the Hadoop core configuration (the HDFS port is 9000; hadoop.tmp.dir is file:/home/hadoop/hadoop-2.6.4/var)
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://node1:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/hadoop-2.6.4/var</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>hadoop.proxyuser.spark.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.spark.groups</name>
<value>*</value>
</property>
</configuration>
4.5. Configure hdfs-site.xml: add the HDFS configuration (SecondaryNameNode address, NameNode/DataNode storage directories, replication factor)
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>node1:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/hadoop-2.6.4/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/hadoop/hadoop-2.6.4/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
4.6. Configure mapred-site.xml: add the MapReduce configuration (use the YARN framework; the jobhistory service address and web address)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>node1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>node1:19888</value>
</property>
</configuration>
4.7. Configure yarn-site.xml: add the YARN configuration (shuffle service and ResourceManager addresses)
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>node1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>node1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>node1:8035</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>node1:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>node1:8088</value>
</property>
</configuration>
5. Copy the configured hadoop-2.6.4 directory to node2 and node3:
scp -r hadoop-2.6.4 hadoop@node2:/home/hadoop/
scp -r hadoop-2.6.4 hadoop@node3:/home/hadoop/
6. Verification (commands are run from /home/hadoop/hadoop-2.6.4 on node1)
1. Format the NameNode: ./bin/hdfs namenode -format
2. Start HDFS: ./sbin/start-dfs.sh
3. Stop HDFS: ./sbin/stop-dfs.sh
4. Start YARN: ./sbin/start-yarn.sh
5. Stop YARN: ./sbin/stop-yarn.sh
6. Check the cluster status:
[hadoop@node1 hadoop-2.6.4]$ ./bin/hdfs dfsadmin -report
16/05/26 10:51:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 56338194432 (52.47 GB)
Present Capacity: 42922237952 (39.97 GB)
DFS Remaining: 42922164224 (39.97 GB)
DFS Used: 73728 (72 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Live datanodes (3):
Name: 172.16.200.4:50010 (node1)
Hostname: node1
Decommission Status : Normal
Configured Capacity: 18779398144 (17.49 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 4559396864 (4.25 GB)
DFS Remaining: 14219976704 (13.24 GB)
DFS Used%: 0.00%
DFS Remaining%: 75.72%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu May 26 10:51:35 CST 2016
Name: 172.16.200.5:50010 (node2)
Hostname: node2
Decommission Status : Normal
Configured Capacity: 18779398144 (17.49 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 4369121280 (4.07 GB)
DFS Remaining: 14410252288 (13.42 GB)
DFS Used%: 0.00%
DFS Remaining%: 76.73%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu May 26 10:51:35 CST 2016
Name: 172.16.200.6:50010 (node3)
Hostname: node3
Decommission Status : Normal
Configured Capacity: 18779398144 (17.49 GB)
DFS Used: 24576 (24 KB)
Non DFS Used: 4487438336 (4.18 GB)
DFS Remaining: 14291935232 (13.31 GB)
DFS Used%: 0.00%
DFS Remaining%: 76.10%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Thu May 26 10:51:35 CST 2016
7. View the ResourceManager web UI: http://172.16.200.4:8088/