Hardware: four virtual machines, hadoop1~hadoop4, each with 3 GB RAM, a 60 GB disk and 2 CPU cores
Software: CentOS 6.5, hadoop-2.6.0-cdh5.8.2, JDK 1.7
Deployment plan:
hadoop1 (192.168.0.3): namenode (active), resourcemanager
hadoop2 (192.168.0.4): namenode (standby), journalnode, datanode, nodemanager, historyserver
hadoop3 (192.168.0.5): journalnode, datanode, nodemanager
hadoop4 (192.168.0.6): journalnode, datanode, nodemanager
HDFS HA uses QJM (journalnodes).
I. System preparation
1. Disable SELinux on every node
#vi /etc/selinux/config
SELINUX=disabled
2. Disable the firewall on every node (do not skip this; otherwise formatting HDFS fails because the journalnodes cannot be reached)
#chkconfig iptables off
#service iptables stop
3. Install JDK 1.7 on every node
#cd /software
#tar -zxf jdk-7u65-linux-x64.gz -C /opt/
#cd /opt
#ln -s jdk1.7.0_65 java
#vi /etc/profile
export JAVA_HOME=/opt/java
export PATH=$PATH:$JAVA_HOME/bin
4. Create the hadoop user on every node and set up mutual SSH trust
#useradd grid
#passwd grid
(The original omitted the mutual-trust steps; a sketch follows.)
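A minimal sketch of the mutual-trust setup, run as the grid user on every node (hostnames from the plan above; ssh-copy-id assumes openssh-clients is installed):
$ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
$ssh-copy-id grid@hadoop1
$ssh-copy-id grid@hadoop2
$ssh-copy-id grid@hadoop3
$ssh-copy-id grid@hadoop4
$ssh hadoop2 date    (verify that no password is asked)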
5. Create the required directories on every node
#mkdir -p /hadoop_data/hdfs/name
#mkdir -p /hadoop_data/hdfs/data
#mkdir -p /hadoop_data/hdfs/journal
#mkdir -p /hadoop_data/yarn/local
#chown -R grid:grid /hadoop_data
II. Hadoop deployment
For HDFS HA you mainly define a nameservice (without HDFS federation there is only one nameservice ID) and, under that nameservice ID, the two namenodes and their addresses. Here the nameservice is named hadoop-spark.
1. Unpack the hadoop tarball on every node
#cd /software
#tar -zxf hadoop-2.6.0-cdh5.8.2.tar.gz -C /opt/
#cd /opt
#chown -R grid:grid hadoop-2.6.0-cdh5.8.2
#ln -s hadoop-2.6.0-cdh5.8.2 hadoop
2. Switch to the grid user for the remaining steps
#su - grid
$cd /opt/hadoop/etc/hadoop
3. Configure hadoop-env.sh (only JAVA_HOME needs to be set)
$vi hadoop-env.sh
# The java implementation to use.
export JAVA_HOME=/opt/java
4. Configure hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>hadoop-spark</value>
<description>
Comma-separated list of nameservices.
</description>
</property>
<property>
<name>dfs.ha.namenodes.hadoop-spark</name>
<value>nn1,nn2</value>
<description>
The prefix for a given nameservice, contains a comma-separated
list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
</description>
</property>
<property>
<name>dfs.namenode.rpc-address.hadoop-spark.nn1</name>
<value>hadoop1:8020</value>
<description>
RPC address for namenode1 of hadoop-spark
</description>
</property>
<property>
<name>dfs.namenode.rpc-address.hadoop-spark.nn2</name>
<value>hadoop2:8020</value>
<description>
RPC address for namenode2 of hadoop-spark
</description>
</property>
<property>
<name>dfs.namenode.http-address.hadoop-spark.nn1</name>
<value>hadoop1:50070</value>
<description>
The address and the base port where the dfs namenode1 web ui will listen on.
</description>
</property>
<property>
<name>dfs.namenode.http-address.hadoop-spark.nn2</name>
<value>hadoop2:50070</value>
<description>
The address and the base port where the dfs namenode2 web ui will listen on.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///hadoop_data/hdfs/name</value>
<description>Determines where on the local filesystem the DFS name node
should store the name table(fsimage). If this is a comma-delimited list
of directories then the name table is replicated in all of the
directories, for redundancy. </description>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop2:8485;hadoop3:8485;hadoop4:8485/hadoop-spark</value>
<description>A directory on shared storage between the multiple namenodes
in an HA cluster. This directory will be written by the active and read
by the standby in order to keep the namespaces synchronized. This directory
does not need to be listed in dfs.namenode.edits.dir above. It should be
left empty in a non-HA cluster.
</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///hadoop_data/hdfs/data</value>
<description>Determines where on the local filesystem an DFS data node
should store its blocks. If this is a comma-delimited
list of directories, then data will be stored in all named
directories, typically on different devices.
Directories that do not exist are ignored.
</description>
</property>
<!-- Without this property, HDFS cannot be accessed through the nameservice name; clients would have to point directly at the active namenode's address -->
<property>
<name>dfs.client.failover.proxy.provider.hadoop-spark</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>false</value>
<description>
Whether automatic failover is enabled. See the HDFS High
Availability documentation for details on automatic HA
configuration.
</description>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/hadoop_data/hdfs/journal</value>
</property>
</configuration>
5. Configure core-site.xml (set fs.defaultFS to the HA nameservice name)
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-spark</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
6. Configure mapred-site.xml
<configuration>
<!-- MR YARN Application properties -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>The runtime framework for executing MapReduce jobs.
Can be one of local, classic or yarn.
</description>
</property>
<!-- jobhistory properties -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop2:10020</value>
<description>MapReduce JobHistory Server IPC host:port</description>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop2:19888</value>
<description>MapReduce JobHistory Server Web UI host:port</description>
</property>
</configuration>
7. Configure yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<!-- Resource Manager Configs -->
<property>
<description>The hostname of the RM.</description>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop1</value>
</property>
<property>
<description>The address of the applications manager interface in the RM.</description>
<name>yarn.resourcemanager.address</name>
<value>${yarn.resourcemanager.hostname}:8032</value>
</property>
<property>
<description>The address of the scheduler interface.</description>
<name>yarn.resourcemanager.scheduler.address</name>
<value>${yarn.resourcemanager.hostname}:8030</value>
</property>
<property>
<description>The http address of the RM web application.</description>
<name>yarn.resourcemanager.webapp.address</name>
<value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<property>
<description>The https address of the RM web application.</description>
<name>yarn.resourcemanager.webapp.https.address</name>
<value>${yarn.resourcemanager.hostname}:8090</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>${yarn.resourcemanager.hostname}:8031</value>
</property>
<property>
<description>The address of the RM admin interface.</description>
<name>yarn.resourcemanager.admin.address</name>
<value>${yarn.resourcemanager.hostname}:8033</value>
</property>
<property>
<description>The class to use as the resource scheduler.</description>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<description>fair-scheduler conf location</description>
<name>yarn.scheduler.fair.allocation.file</name>
<value>${yarn.home.dir}/etc/hadoop/fairscheduler.xml</value>
</property>
<property>
<description>List of directories to store localized files in. An
application's localized file directory will be found in:
${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.
Individual containers' work directories, called container_${contid}, will
be subdirectories of this.
</description>
<name>yarn.nodemanager.local-dirs</name>
<value>/hadoop_data/yarn/local</value>
</property>
<property>
<description>Whether to enable log aggregation</description>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<description>Where to aggregate logs to.</description>
<name>yarn.nodemanager.remote-app-log-dir</name>
<value>/tmp/logs</value>
</property>
<property>
<description>Amount of physical memory, in MB, that can be allocated
for containers.</description>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
<property>
<description>Number of CPU cores that can be allocated
for containers.</description>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>2</value>
</property>
<property>
<description>the valid service name should only contain a-zA-Z0-9_ and can not start with numbers</description>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
8. Configure slaves
hadoop2
hadoop3
hadoop4
9. Configure fairscheduler.xml
<?xml version="1.0"?>
<allocations>
<queue name="common">
<minResources>0mb, 0 vcores </minResources>
<maxResources>6144 mb, 6 vcores </maxResources>
<maxRunningApps>50</maxRunningApps>
<minSharePreemptionTimeout>300</minSharePreemptionTimeout>
<weight>1.0</weight>
<aclSubmitApps>grid</aclSubmitApps>
</queue>
</allocations>
10. Sync the configuration files to every node
$cd /opt/hadoop/etc
$scp -r hadoop hadoop2:/opt/hadoop/etc/
$scp -r hadoop hadoop3:/opt/hadoop/etc/
$scp -r hadoop hadoop4:/opt/hadoop/etc/
III. Starting the cluster (formatting the filesystem)
1. Set environment variables
$vi ~/.bash_profile
export HADOOP_HOME=/opt/hadoop
export YARN_HOME_DIR=/opt/hadoop
export HADOOP_CONF_DIR=/opt/hadoop/etc/hadoop
export YARN_CONF_DIR=/opt/hadoop/etc/hadoop
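Apply the variables in the current shell and verify (the same applies on every node):
$source ~/.bash_profile
$echo $HADOOP_HOME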
2. Start HDFS
Start the journalnodes first. On hadoop2~hadoop4:
$cd /opt/hadoop/
$sbin/hadoop-daemon.sh start journalnode
Format HDFS, then start the first namenode. On hadoop1:
$bin/hdfs namenode -format
$sbin/hadoop-daemon.sh start namenode
Bootstrap the second namenode from the first and start it. On hadoop2:
$bin/hdfs namenode -bootstrapStandby
$sbin/hadoop-daemon.sh start namenode
At this point both namenodes are in standby; switch hadoop1 to active (hadoop1 is nn1 in hdfs-site.xml):
$bin/hdfs haadmin -transitionToActive nn1
Start the datanodes. On hadoop1 (the active namenode):
$sbin/hadoop-daemons.sh start datanode
Note: for later restarts, sbin/start-dfs.sh is enough. However, since ZooKeeper-based failover is not configured, HA can only be switched manually, so after every HDFS start you must run $bin/hdfs haadmin -transitionToActive nn1 to make the namenode on hadoop1 active.
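To check which namenode is currently active or standby, the state can be queried (nn1/nn2 as defined in hdfs-site.xml):
$bin/hdfs haadmin -getServiceState nn1
$bin/hdfs haadmin -getServiceState nn2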
3. Start YARN
On hadoop1 (the resourcemanager):
$sbin/start-yarn.sh
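The plan above also places the historyserver on hadoop2, matching the mapred-site.xml addresses; it is started separately there with the standard Hadoop 2.x script:
$sbin/mr-jobhistory-daemon.sh start historyserver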
————————————————————————————————————————————
The HDFS HA configured above does not fail over automatically. To enable automatic failover, add the following steps (stop the cluster first):
1. Deploy ZooKeeper on hadoop2, hadoop3 and hadoop4 and start it (deployment steps omitted).
2. Add to hdfs-site.xml:
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/exampleuser/.ssh/id_rsa</value>
</property>
See the official documentation for details. This configures the fencing method to SSH into the previously active namenode and kill the process holding its service port; the prerequisite is that the two namenodes can SSH to each other without a password.
There is an alternative configuration:
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>shell(/path/to/my/script.sh arg1 arg2 ...)</value>
</property>
This configuration uses a shell command to do the fencing. If no real action is wanted, dfs.ha.fencing.methods can be set to shell(/bin/true).
3. Add to core-site.xml:
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop2:2181,hadoop3:2181,hadoop4:2181</value>
</property>
4. Initialize ZKFC (run on a namenode):
bin/hdfs zkfc -formatZK
5. Start the cluster (see the sketch below)
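A sketch of the start sequence once automatic failover is configured (assuming ZooKeeper runs on hadoop2~hadoop4 and is managed with its own zkServer.sh):
On hadoop2~hadoop4: zkServer.sh start    (from your ZooKeeper installation)
On hadoop1: $sbin/start-dfs.sh    (with automatic failover enabled it also starts a zkfc on each namenode)
Verify that one namenode was elected active:
$bin/hdfs haadmin -getServiceState nn1
$bin/hdfs haadmin -getServiceState nn2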
___________________________________________________________________________________________________
zkfc: runs on every namenode; it is a ZooKeeper client and performs the automatic failover
zk: an odd number of nodes; maintains the coordination lock and elects the active node
journalnode: an odd number of nodes; synchronizes data between the active and standby namenodes: the active namenode writes its edits to them and the standby reads them
————————————————————————————————————————————
Converting to ResourceManager HA:
Choose hadoop2 as the second RM node.
1. Set up SSH trust from hadoop2 to the other nodes
2. Edit yarn-site.xml (see the sketch below) and sync it to the other machines
3. Copy fairscheduler.xml to hadoop2
4. Start the first RM
5. Start the second RM
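A minimal sketch of the yarn-site.xml additions for RM HA, assuming hadoop1/hadoop2 as rm1/rm2 and the ZooKeeper quorum used above (property names as in Hadoop 2.6):
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn-cluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop1</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop2</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop2:2181,hadoop3:2181,hadoop4:2181</value>
</property>
The single-RM address properties configured earlier (yarn.resourcemanager.address and friends) are typically removed or given per-rm-id variants. Then start YARN on hadoop1 with sbin/start-yarn.sh, start the second RM on hadoop2 with sbin/yarn-daemon.sh start resourcemanager (start-yarn.sh does not start a remote standby RM), and check which RM is active with yarn rmadmin -getServiceState rm1 / rm2.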
Configuration files
1. core-site.xml:
[qujian@master hadoop]$ vim core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/qujian/hadoop-2.7.2/tmp</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>master.hadoop.cn:2181,second1.hadoop.cn:2181,second2.hadoop.cn:2181</value>
</property>
<property>
<name>ha.zookeeper.session-timeout.ms</name>
<value>1000</value>
</property>
Edit mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
Edit yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master.hadoop.cn:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master.hadoop.cn:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master.hadoop.cn:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master.hadoop.cn:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master.hadoop.cn:8088</value>
</property>
Edit hdfs-site.xml
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/qujian/hadoop-2.7.2/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/qujian/hadoop-2.7.2/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>master.hadoop.cn:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>second1.hadoop.cn:9000</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.mycluster.nn1</name>
<value>master.hadoop.cn:53310</value>
</property>
<property>
<name>dfs.namenode.servicerpc-address.mycluster.nn2</name>
<value>second1.hadoop.cn:53310</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>master.hadoop.cn:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>second1.hadoop.cn:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://second2.hadoop.cn:8485;data1.hadoop.cn:8485;data2.hadoop.cn:8485/mycluster</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/qujian/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/qujian/hadoop-2.7.2/journal</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>ha.failover-controller.cli-check.rpc-timeout.ms</name>
<value>60000</value>
</property>
<property>
<name>ipc.client.connect-timeout</name>
<value>60000</value>
</property>
<property>
<name>dfs.image.transfer.bandwidthPerSec</name>
<value>4194304</value>
</property>
Configure the data (slave) nodes:
[qujian@master hadoop]$ cat slaves
second2.hadoop.cn
data1.hadoop.cn
data2.hadoop.cn
for n in second1.hadoop.cn second2.hadoop.cn data1.hadoop.cn data2.hadoop.cn
do
scp -rp /home/qujian/hadoop-2.7.2 $n:~/
wait
done
Four machines: bei1 bei2 bei3 bei4
       NN  DN  ZK  ZKFC  JN  RM  NM
bei1   Y       Y   Y
bei2   Y   Y   Y   Y     Y   Y   Y
bei3       Y   Y         Y       Y
bei4       Y             Y       Y
1. Update packages and disable the firewall
yum -y update
Note: this can be skipped when using a local yum repository.
Open a new terminal while the update runs so the following steps are not blocked.
# service iptables stop
# chkconfig iptables off
2. Map IPs to hostnames in /etc/hosts
# vi /etc/hosts
192.168.31.131 bei1
192.168.31.132 bei2
192.168.31.133 bei3
192.168.31.134 bei4
3. For virtual machines, remove the UUID and MAC address lines from /etc/sysconfig/network-scripts/ifcfg-eth0
# vi /etc/sysconfig/network-scripts/ifcfg-eth0
4. Delete /etc/udev/rules.d/70-persistent-net.rules, the udev rule file that pins NIC MAC addresses
# rm -rf /etc/udev/rules.d/70-persistent-net.rules
Note: steps 3 and 4 can be skipped on nodes that are not clones or copies of another virtual machine.
5. Reboot the host after the yum update.
6. Prepare the environment
6.1 yum -y install gcc gcc-c++ autoconf automake cmake ntp rsync ssh vim
yum -y install zlib zlib-devel openssl openssl-devel pcre-devel
Note: some of these packages are not strictly needed by Hadoop, but they come in handy when building other software from source later.
Three of them are required here:
ssh - for communication between nodes (CentOS 6.7 already ships openssh)
rsync - for remote synchronization
ntp - for time synchronization
6.2 Once the first yum command in 6.1 has finished, open a new terminal and synchronize the time with NTP; this step is important.
6.2.1 Enable the ntpd service at boot
chkconfig ntpd on
6.2.2 Synchronize the time
ntpdate ntp.sjtu.edu.cn
6.2.3 Start the ntpd service
/etc/init.d/ntpd start
6.2.4 Verify that ntpd is running
pgrep ntpd
6.2.5 Initial synchronization
ntpdate -u ntp.sjtu.edu.cn
6.2.6 Confirm the synchronization succeeded
ntpq -p
Note: the commands above can be entered in one go:
chkconfig ntpd on
ntpdate ntp.sjtu.edu.cn
/etc/init.d/ntpd start
pgrep ntpd
ntpdate -u ntp.sjtu.edu.cn
ntpq -p
Once the yum update from step 6.1 has completed, rebooting the host is recommended.
7. Install the JDK
7.1 Copy the JDK package to the home directory
7.2 rpm -ivh jdk_xxxxxxxx.rpm
7.3 By default the JDK installs to /usr/java/jdk1.7.0_79
7.4 Configure the JDK environment variables
# vim ~/.bash_profile
Add the following four lines:
export JAVA_HOME=/opt/sxt/soft/jdk1.7.0_80
export PATH=$PATH:$JAVA_HOME/bin
export HADOOP_HOME=/opt/sxt/soft/hadoop-2.5.1
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
After editing, apply ~/.bash_profile with the source command:
source ~/.bash_profile
Check the environment variables:
printenv
8. Install tomcat (optional, but it will come in handy later)
Copy tomcat to /opt/sxt and unpack it:
# tar -zxvf apache-tomcat-xxxxx.tar.gz
9. Upload Hadoop to /opt/sxt and unpack it
# tar -zxvf hadoop-2.5.1_x64.tar.gz
9.1 Create the hadoop.tmp.dir directory
# mkdir -p /opt/hadooptmp
9.2 etc/hadoop/core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://bjsxt</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>bei1:2181,bei2:2181,bei3:2181</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadooptmp</value>
<!-- temporary file directory -->
</property>
9.3 etc/hadoop/hdfs-site.xml:
<property>
<name>dfs.nameservices</name>
<value>bjsxt</value>
</property>
<property>
<name>dfs.ha.namenodes.bjsxt</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.bjsxt.nn1</name>
<value>bei1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.bjsxt.nn2</name>
<value>bei2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.bjsxt.nn1</name>
<value>bei1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.bjsxt.nn2</name>
<value>bei2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://bei2:8485;bei3:8485;bei4:8485/bjsxt</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.bjsxt</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_dsa</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/hadooptmp/data</value>
<!-- journalnode edits directory -->
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
9.4 Clone the virtual machine
9.5 On each clone, change the hostname, IP, gateway and MAC
Change the hostname:
vim /etc/sysconfig/network
Change the IP address:
vi /etc/sysconfig/network-scripts/ifcfg-eth0
Change the DNS settings:
vi /etc/resolv.conf (the search and nameserver entries)
10. Check passwordless SSH to localhost
10.1 First check
ssh localhost
Note: after a successful remote login, remember to exit.
10.2 Create a local key pair and append the public key to the authorized keys file
# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
10.3 Check again
ssh localhost
Note: exit again afterwards.
10.4 On the NameNode, copy ~/.ssh/authorized_keys to every node
scp ~/.ssh/authorized_keys root@hadoopsnn:~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys root@hadoopdn1:~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys root@hadoopdn2:~/.ssh/authorized_keys
10.5 Edit /opt/sxt/soft/hadoop-2.5.1/etc/hadoop/hadoop-env.sh; by default Hadoop does not pick up JAVA_HOME from the user environment, so it must be set explicitly
vim /opt/sxt/soft/hadoop-2.5.1/etc/hadoop/hadoop-env.sh
Find export JAVA_HOME=${JAVA_HOME}
and change it to export JAVA_HOME=/opt/sxt/soft/jdk1.7.0_80
Then add the following line:
export HADOOP_PREFIX=/opt/sxt/soft/hadoop-2.5.1
11. Install and configure zookeeper
11.1 Three zookeeper nodes: bei1, bei2, bei3
11.2 Edit the zoo.cfg configuration file
Set dataDir=/opt/sxt/zookeeperdatadir
tickTime=2000
dataDir=/opt/sxt/zookeeperdatadir
clientPort=2181
initLimit=5
syncLimit=2
server.1=bei1:2888:3888
server.2=bei2:2888:3888
server.3=bei3:2888:3888
11.3 In the dataDir directory on each node, create a file named myid containing 1, 2 or 3 respectively (example below)
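For example, matching the server.N lines above:
On bei1: echo 1 > /opt/sxt/zookeeperdatadir/myid
On bei2: echo 2 > /opt/sxt/zookeeperdatadir/myid
On bei3: echo 3 > /opt/sxt/zookeeperdatadir/myid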
12. Configure the slaves file in hadoop; it lists the DataNode (DN) hosts
******* From this step on, follow the steps carefully; whenever a configuration file is changed, the affected services must be restarted *******
13. Start the three zookeepers: /opt/sxt/zookeeper-3.4.6/bin/zkServer.sh start
14. Start the three JournalNodes: ./hadoop-daemon.sh start journalnode
15. Format one of the namenodes: bin/hdfs namenode -format
16. Copy the freshly formatted metadata to the other namenode
16.1 Start the namenode that was just formatted: hadoop-daemon.sh start namenode
16.2 On the namenode that was not formatted, run: hdfs namenode -bootstrapStandby
16.3 Start the second namenode
17. Initialize zkfc on one of the namenodes: hdfs zkfc -formatZK
18. Stop the daemons started above: stop-dfs.sh
19. Full start: start-dfs.sh
20. Check with jps on every node and in the web UI (see below)
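Roughly what to expect, sketched from the role table above (YARN daemons are not started by start-dfs.sh):
bei1: NameNode, DFSZKFailoverController, QuorumPeerMain
bei2: NameNode, DataNode, DFSZKFailoverController, JournalNode, QuorumPeerMain
bei3: DataNode, JournalNode, QuorumPeerMain
bei4: DataNode, JournalNode
Web UI: http://bei1:50070 and http://bei2:50070 (one namenode should show active, the other standby).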
Preparation
1. Download the packages
Hadoop: wget http://apache.fayea.com/hadoop/common/hadoop-2.7.1/hadoop-2.7.1.tar.gz
Hbase: wget http://apache.fayea.com/hbase/1.1.4/hbase-1.1.4-bin.tar.gz
2. Install and configure the JDK.
3. Set up passwordless SSH login. Reference: SSH passwordless login
4. Install and configure zookeeper-3.4.7. Reference: ZooKeeper cluster installation and configuration on Debian
Create the user
1. Log in as root.
2. Create the hadoop group: groupadd hadoop
3. Create the hadoop user: sudo useradd -s /bin/bash -d /home/hadoop -m hadoop -g hadoop -G root
4. Set the password: passwd hadoop, entering the new password twice as prompted.
5. Switch from root to the hadoop user: su hadoop
6. Change to hadoop's home directory: cd
7. Configure the environment variables; edit with: vim .bashrc
alias ll='ls -l'
export JAVA_HOME=/usr/local/jdk1.7.0_80/
export JRE_HOME=/usr/local/jdk1.7.0_80/jre
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JRE_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
8. Save and exit, then apply the configuration: source .bashrc
Hadoop HA setup
1. Unpack: tar -zxvf hadoop-2.7.1.tar.gz
2. Enter the configuration directory: cd hadoop-2.7.1/etc/hadoop/
3. Configure the following files.
(1) hadoop-env.sh: set the JAVA_HOME path:
export JAVA_HOME=/usr/local/jdk1.7.0_80/
(2) core-site.xml. Recommended configuration:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hdp1:2181,hdp2:2181,hdp3:2181</value>
</property>
</configuration>
(3) hdfs-site.xml. Recommended configuration:
<configuration>
<property>
<name>dfs.name.dir</name>
<value>/home/hadoop/tmp/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/hadoop/tmp/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>ns</value>
</property>
<property>
<name>dfs.ha.namenodes.ns</name>
<value>hdp1,hdp2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns.hdp1</name>
<value>hdp1:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns.hdp1</name>
<value>hdp1:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns.hdp2</name>
<value>hdp2:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.ns.hdp2</name>
<value>hdp2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hdp3:8485;hdp4:8485/ns</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.ns</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop/tmp/journal</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/mars/.ssh/id_rsa</value>
</property>
</configuration>
(4) mapred-site.xml. Recommended configuration:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
(5) yarn-site.xml. Recommended configuration:
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hdp1</value>
<!-- the resourcemanager runs on hdp1 -->
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
(6) slaves. Data node list:
hdp1
hdp2
hdp3
hdp4
Note: the hosts file already contains:
172.16.1.227 hdp1
172.16.1.228 hdp2
172.16.1.229 hdp3
172.16.1.230 hdp4
The zookeeper cluster consists of hdp1, hdp2 and hdp3.
4. Copy hadoop-2.7.1 to the other nodes, with a command such as:
scp -r hadoop-2.7.1 hadoop@172.16.1.226:/home/hadoop/
5. Create the HA znode in the ZooKeeper cluster
In hadoop-2.7.1/bin on hdp1, run: ./hdfs zkfc -formatZK
On success the output contains: Successfully created /hadoop-ha/ns in ZK.
6. Start the journalnodes.
According to hdfs-site.xml, the journalnodes run on hdp3 and hdp4.
On each of those nodes, in hadoop-2.7.1/sbin, run:
./hadoop-daemon.sh start journalnode
Run jps to confirm that a JournalNode process has started.
7. Format the NameNode.
(1) In hadoop-2.7.1/bin on hdp1, run:
./hdfs namenode -format -clusterId ss
(2) Success is indicated by a message such as: Storage directory /home/hadoop/tmp/name has been successfully formatted.
(3) Copy the metadata generated by this command to hdp2, which will serve as the standby NameNode.
(4) Run:
scp -r /home/hadoop/ root@hdp2:/home/hadoop/ (the namenode metadata path is dfs.name.dir in hdfs-site.xml)
8. On the master node hdp1, start the hadoop cluster.
In hadoop-2.7.1/sbin/, start the cluster with:
./start-all.sh
9. Verify the startup.
(1) Use jps to check the processes on each node.
The master node should show NameNode, ResourceManager and related processes; the slave nodes should show DataNode and NodeManager.
(2) Open the web pages of both namenodes in a browser:
http://hdp1:50070/
http://hdp2:50070/
At this point both namenodes show as standby; the Datanodes tab lists the datanodes.
10. Activate a NameNode.
On hdp1 and then hdp2, in hadoop/sbin/, run:
./hadoop-daemon.sh start zkfc
It appears that whichever node runs this first becomes the active one; jps should now also show a DFSZKFailoverController process.
Quick test
In hadoop-2.7.1/bin/, run:
1. List the HDFS root: ./hadoop fs -ls /
2. Create a local txt file and upload it to HDFS: ./hadoop fs -put test.txt /
3. View the uploaded file: ./hadoop fs -text /test.txt
4. If the file contents print correctly, the basic test passes.
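If YARN is also up, a fuller smoke test is to run the bundled wordcount example (jar path assumed for the 2.7.1 layout; still from hadoop-2.7.1/bin/):
./hadoop fs -mkdir -p /input
./hadoop fs -put test.txt /input/
./hadoop jar ../share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input /output
./hadoop fs -cat /output/part-r-00000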
Environment variables
1. Edit the environment file: vim .bashrc
2. Configure:
(1) Before PATH, add: export HADOOP_HOME=/home/hadoop/hadoop-2.7.1
(2) Append to PATH: :$HADOOP_HOME/bin:$HADOOP_HOME/sbin
3. Save and exit, then apply the configuration: source .bashrc
Difference between hadoop-daemon.sh and hadoop-daemons.sh
hadoop-daemon.sh starts a daemon only on the local node
hadoop-daemons.sh starts the daemon on the remote (slave) nodes
1. Start the JournalNodes
hadoop-daemons.sh start journalnode
hdfs namenode -initializeSharedEdits
// copies the edits log to the journalnodes; needed only once, after the namenode has been formatted
Check http://hadoop-yarn1:8480 to see whether the journalnodes are healthy
2. Format the namenode and start the active namenode
a. Format the namenode on the active NameNode host:
hdfs namenode -format
hdfs namenode -initializeSharedEdits
(journalnode initialization complete)
b. Start the active namenode:
hadoop-daemon.sh start namenode
3. Start the standby namenode
a. Bootstrap the standby node by copying the active namenode's metadata to it:
hdfs namenode -bootstrapStandby
b. Start the standby node:
hadoop-daemon.sh start namenode
4. Start automatic failover
Create the /hadoop-ha/ns1 monitoring znode in zookeeper:
hdfs zkfc -formatZK
start-dfs.sh
5. Check the namenode state
hdfs haadmin -getServiceState nn1
active
6. Fail over from nn1 to nn2
hdfs haadmin -failover nn1 nn2
Full configuration files
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/modules/hadoop-2.2.0/data/tmp</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>1440</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop-yarn1:2181,hadoop-yarn2:2181,hadoop-yarn3:2181</value>
</property>
<property>
<name>hadoop.http.staticuser.user</name>
<value>yuanhai</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>hadoop-yarn1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>hadoop-yarn2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>hadoop-yarn1:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>hadoop-yarn2:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-yarn1:8485;hadoop-yarn2:8485;hadoop-yarn3:8485/ns1</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/modules/hadoop-2.2.0/data/tmp/journal</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<!-- <property>
<name>dfs.namenode.http-address</name>
<value>hadoop-yarn.dragon.org:50070</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>hadoop-yarn.dragon.org:50090</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file://${hadoop.tmp.dir}/dfs/name</value>
</property>
<property>
<name>dfs.namenode.edits.dir</name>
<value>${dfs.namenode.name.dir}</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file://${hadoop.tmp.dir}/dfs/data</value>
</property>
<property>
<name>dfs.namenode.checkpoint.dir</name>
<value>file://${hadoop.tmp.dir}/dfs/namesecondary</value>
</property>
<property>
<name>dfs.namenode.checkpoint.edits.dir</name>
<value>${dfs.namenode.checkpoint.dir}</value>
</property>
-->
</configuration>
slaves
hadoop-yarn1
hadoop-yarn2
hadoop-yarn3
yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-yarn1</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop-yarn1:10020</value>
<description>MapReduce JobHistory Server IPC host:port</description>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop-yarn1:19888</value>
<description>MapReduce JobHistory Server Web UI host:port</description>
</property>
<property>
<name>mapreduce.job.ubertask.enable</name>
<value>true</value>
</property>
</configuration>
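Since the jobhistory addresses above point at hadoop-yarn1, the history server has to be started there separately (standard Hadoop 2.x script):
sbin/mr-jobhistory-daemon.sh start historyserver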
hadoop-env.sh
export JAVA_HOME=/opt/modules/jdk1.6.0_24