Hadoop Installation on Linux

1. Operating System

[hadoop@karei hadoop]$ uname -a
Linux karei 2.6.18-371.4.1.el5 #1 SMP Wed Jan 8 18:42:07 EST 2014 x86_64 x86_64 x86_64 GNU/Linux
[hadoop@karei hadoop]$

2. Cluster Info

[root@karei ~]# clustat

3. Install the JDK and set environment variables for Java

a. Install

rpm -ihv jdk-8u45-linux-i586.rpm

Note: step 1 shows an x86_64 host; the 32-bit i586 package runs only if 32-bit support libraries are installed, so jdk-8u45-linux-x64.rpm is usually the better match here.

b. Set environment variables

export JAVA_HOME=/usr/java/jdk1.8.0_45
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

Note: 

Per user: add the lines to ~/.bash_profile, then run source ~/.bash_profile

System-wide: add the lines to /etc/profile, then run source /etc/profile

4. Download Hadoop from http://mirror.netinch.com/pub/apache/hadoop/common/

5. Extract and install hadoop-2.7.0.tar.gz

a. gunzip hadoop-2.7.0.tar.gz

b. tar -xvf hadoop-2.7.0.tar.gz -C /usr/local/

c. cd /usr/local/

d. mv hadoop-2.7.0 hadoop

6. Set environment variables for Hadoop

export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
#export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native:$JAVA_LIBRARY_PATH
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
#export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
#export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib"
 

Note: 

Per user: add the lines to ~/.bash_profile, then run source ~/.bash_profile

System-wide: add the lines to /etc/profile, then run source /etc/profile
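
After sourcing the profile, the derived variables should all resolve against HADOOP_HOME. A quick sanity check, assuming the /usr/local/hadoop layout above (no daemons required):

```shell
# Sanity check: the derived variables should expand against HADOOP_HOME.
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HADOOP_MAPRED_HOME=$HADOOP_HOME
echo "$HADOOP_CONF_DIR"   # should print /usr/local/hadoop/etc/hadoop
```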

7. Create the Hadoop user and group on all nodes of the cluster

a. Create group

groupadd bigdata

b. Create user

adduser hadoop

c. Add user into group

usermod -a -G bigdata hadoop

d. Set password for user

passwd hadoop

e. Change ownership of the Hadoop install directory

chown -R hadoop:bigdata /usr/local/hadoop

(-R makes the change recursive; without it, the directory contents stay root-owned and the hadoop user cannot write logs or temporary files there.)

8. Set up passwordless SSH (RSA keys) as the hadoop user on all nodes of the cluster

a. ssh-keygen -t rsa 
b. ssh-copy-id <username>@<remote-host>
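
The two commands above have to reach every member of the cluster. A dry-run sketch using the host names configured in /etc/hosts below (drop the leading echo to actually distribute the hadoop user's key):

```shell
# Dry run: print the key-copy command for each cluster node.
# Remove the leading echo to actually copy the hadoop user's key.
for node in gerra hemei karei lephi; do
    echo ssh-copy-id hadoop@"$node"
done
```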

9. List the cluster nodes in /etc/hosts

[root@karei ~]# cat /etc/hosts
127.0.0.1 localhost.localdomain localhost
::1 localhost6.localdomain6 localhost6
100.99.167.53 gerra.bigdata.hadoop.net gerra
100.99.167.54 hemei.bigdata.hadoop.net hemei
100.99.167.55 karei.bigdata.hadoop.net karei
100.99.167.56 lephi.bigdata.hadoop.net lephi
[root@karei ~]#

10. Configure the masters and slaves files under $HADOOP_HOME/etc/hadoop

[root@karei hadoop]#
[root@karei hadoop]# cat $HADOOP_HOME/etc/hadoop/masters
karei
hemei
[root@karei hadoop]#
[root@karei hadoop]#
[root@karei hadoop]# cat $HADOOP_HOME/etc/hadoop/slaves
gerra
lephi

[root@karei hadoop]#
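
Since the same node list feeds /etc/hosts, masters, and slaves, it can help to generate the two files from one list. A sketch that writes to a scratch directory (on the real cluster, point OUT at $HADOOP_HOME/etc/hadoop; the worker file is still named slaves in Hadoop 2.x):

```shell
# Generate masters/slaves from one node list. OUT is a scratch dir here;
# on the cluster it would be $HADOOP_HOME/etc/hadoop.
OUT=$(mktemp -d)
printf '%s\n' karei hemei > "$OUT/masters"
printf '%s\n' gerra lephi > "$OUT/slaves"
cat "$OUT/masters" "$OUT/slaves"
```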

11. Configure Hadoop

a. Configure $HADOOP_HOME/etc/hadoop/core-site.xml

<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://karei:9000</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>*</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>*</value>
  </property>
</configuration>

b. Configure $HADOOP_HOME/etc/hadoop/yarn-site.xml

<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>karei:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>karei:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>karei:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>karei:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>karei:8088</value>
  </property>
</configuration>

c. Configure $HADOOP_HOME/etc/hadoop/mapred-site.xml

<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>karei:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>karei:19888</value>
  </property>
</configuration>

d. Configure $HADOOP_HOME/etc/hadoop/hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>hemei:9001</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/tmp/hdfs/datanode</value>
    <description>DataNode directory for storing data blocks.</description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/tmp/hdfs/namenode</value>
    <description>NameNode directory for namespace and transaction logs storage.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <description>Number of replicas for each block.</description>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
</configuration>
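
The dfs.namenode.name.dir and dfs.datanode.data.dir paths above are plain local directories and must exist, writable by the hadoop user, before the first format. A minimal sketch (paths assume the values in this hdfs-site.xml; run as root on every node):

```shell
# Create the local storage directories referenced by hdfs-site.xml.
HDFS_BASE=/usr/local/hadoop/tmp/hdfs
mkdir -p "$HDFS_BASE/namenode" "$HDFS_BASE/datanode"
# On the cluster, hand them to the hadoop user afterwards:
# chown -R hadoop:bigdata /usr/local/hadoop/tmp
ls "$HDFS_BASE"
```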

12. Format the NameNode

[root@karei hadoop]# ./bin/hadoop namenode -format

Note: run the format as the hadoop user rather than root, or the files under the NameNode directory end up root-owned; on Hadoop 2.x, ./bin/hdfs namenode -format is the non-deprecated form of this command.

13. Start Hadoop

[hadoop@karei hadoop]$ ./sbin/start-all.sh 

This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Java HotSpot(TM) Server VM warning: You have loaded library /usr/local/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
15/04/28 21:46:52 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [karei]
karei: namenode running as process 10985. Stop it first.
gerra: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hadoop-datanode-gerra.out
lephi: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hadoop-datanode-lephi.out
gerra: Java HotSpot(TM) Server VM warning: You have loaded library /usr/local/hadoop/lib/native/libhadoop.so which might have disabled stack guard. The VM will try to fix the stack guard now.
gerra: It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
lephi: Java HotSpot(TM) Server VM warning: You have loaded library /usr/local/hadoop/lib/native/libhadoop.so which might have disabled stack guard. The VM will try to fix the stack guard now.
lephi: It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Starting secondary namenodes [hemei]
hemei: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hadoop-secondarynamenode-hemei.out
hemei: Java HotSpot(TM) Server VM warning: You have loaded library /usr/local/hadoop/lib/native/libhadoop.so which might have disabled stack guard. The VM will try to fix the stack guard now.
hemei: It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Java HotSpot(TM) Server VM warning: You have loaded library /usr/local/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
15/04/28 21:47:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-resourcemanager-karei.out
gerra: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-nodemanager-gerra.out
lephi: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-nodemanager-lephi.out


Note: Java HotSpot(TM) Server VM warning
Java HotSpot(TM) Server VM warning: You have loaded library /usr/local/hadoop/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.  
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'. 

Fix: set HADOOP_COMMON_LIB_NATIVE_DIR and HADOOP_OPTS (the two exports commented out in step 6):
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib"

Note: 

Per user: add the lines to ~/.bash_profile, then run source ~/.bash_profile

System-wide: add the lines to /etc/profile, then run source /etc/profile

14. Check status

[hadoop@karei hadoop]$ jps

With the layout above, jps on karei should list NameNode and ResourceManager, hemei should list SecondaryNameNode, and the slaves gerra and lephi should list DataNode and NodeManager.
