Building and Deploying a Hadoop 2.6.4 Cluster from Source on Alibaba Cloud and Tencent Cloud

Author: lizer2016


Compiling Hadoop and setting up a cluster across Tencent Cloud and Alibaba Cloud

Environment preparation

Alibaba Cloud host:

[hadoop@lizer_ali ~]$ uname -a  
Linux lizer_ali 2.6.32-573.22.1.el6.x86_64 #1 SMP Wed Mar 23 03:35:39 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[hadoop@lizer_ali ~]$ head -n 1 /etc/issue
CentOS release 6.5 (Final)
[hadoop@lizer_ali ~]$ cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c 
      1  Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
[hadoop@lizer_ali ~]$ getconf LONG_BIT 
64
[hadoop@lizer_ali ~]$ cat /proc/meminfo 
MemTotal:        1018508 kB
MemFree:          353912 kB

Tencent Cloud host:

[hadoop@lizer_tx ~]$ uname -a  
Linux lizer_tx 2.6.32-573.18.1.el6.x86_64 #1 SMP Tue Feb 9 22:46:17 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
[hadoop@lizer_tx ~]$ head -n 1 /etc/issue
CentOS release 6.7 (Final)
[hadoop@lizer_tx ~]$ cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c 
      1  Intel(R) Xeon(R) CPU E5-26xx v3
[hadoop@lizer_tx ~]$ getconf LONG_BIT 
64
[hadoop@lizer_tx ~]$ cat /proc/meminfo 
MemTotal:        1020224 kB
MemFree:          688488 kB

Create the user

useradd hadoop
passwd hadoop

Install JDK 1.7:

Download: http://www.oracle.com/technetwork/java/javase/downloads/java-archive-downloads-javase7-521261.html#jdk-7u80-oth-JPR

wget http://download.oracle.com/otn/java/jdk/7u80-b15/jdk-7u80-linux-x64.tar.gz?AuthParam=1469844164_7ce09e1f99570835183215c3510e95e0

mv jdk-7u80-linux-x64.tar.gz\?AuthParam\=1469844164_7ce09e1f99570835183215c3510e95e0 jdk-7u80-linux-x64.tar.gz

Unpack the JDK
tar zxf jdk-7u80-linux-x64.tar.gz -C /opt/

Configure environment variables

vim /etc/profile
export JAVA_HOME=/opt/jdk1.7.0_80
export JRE_HOME=/opt/jdk1.7.0_80/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

Apply the changes:
source /etc/profile
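
Since /etc/profile is edited several times in this guide, it can help to append each export line only when it is not already present. The helper below is a minimal sketch: the name append_once is hypothetical (not a standard command), and it simply works on whatever file path it is given.

```shell
# append_once FILE LINE
# Append LINE to FILE unless an identical line is already present.
# (append_once is an illustrative helper name, not a standard command.)
append_once() {
    file=$1
    line=$2
    # -q: quiet, -x: match the whole line, -F: treat LINE as a fixed string.
    # If FILE does not exist yet, grep fails and the line is appended.
    if ! grep -qxF -- "$line" "$file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$file"
    fi
}
```

As root, a call such as append_once /etc/profile 'export JAVA_HOME=/opt/jdk1.7.0_80' can then be repeated safely without duplicating the line.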

Software required to build Hadoop 2.6.4

yum install gcc cmake gcc-c++

Install Maven

wget http://www-eu.apache.org/dist/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
Maven installation guide: http://www.blogjava.net/caojianhua/archive/2011/04/02/347559.html

tar zxf apache-maven-3.3.9-bin.tar.gz -C /usr/local/
vim /etc/profile
export MAVEN_HOME=/usr/local/apache-maven-3.3.9
export PATH=$PATH:$MAVEN_HOME/bin

source /etc/profile

[root@lizer_ali hadoop]# mvn -v
Apache Maven 3.3.9 (bb52d8502b132ec0a5a3f4c09453c07478323dc5; 2015-11-11T00:41:47+08:00)
Maven home: /usr/local/apache-maven-3.3.9
Java version: 1.7.0_80, vendor: Oracle Corporation
Java home: /opt/jdk1.7.0_80/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "2.6.32-573.22.1.el6.x86_64", arch: "amd64", family: "unix"

Install protobuf

Version 2.5.0 is required.
wget https://github.com/google/protobuf/releases/download/v2.5.0/protobuf-2.5.0.tar.gz

tar zxf protobuf-2.5.0.tar.gz
cd protobuf-2.5.0/
./configure --prefix=/usr/local/protobuf-2.5.0
make && make install

vim /etc/profile
export PROTOBUF=/usr/local/protobuf-2.5.0
export PATH=$PROTOBUF/bin:$PATH

source /etc/profile

protoc --version

Install Ant

wget http://www-eu.apache.org/dist//ant/binaries/apache-ant-1.9.7-bin.tar.gz

tar zxf apache-ant-1.9.7-bin.tar.gz -C /usr/local/
vim /etc/profile
export ANT_HOME=/usr/local/apache-ant-1.9.7
export PATH=$PATH:$ANT_HOME/bin

source /etc/profile

ant -version
Apache Ant(TM) version 1.9.7 compiled on April 9 2016

yum install autoconf automake libtool
yum install openssl-devel

Install FindBugs

http://findbugs.sourceforge.net/downloads.html

wget http://prdownloads.sourceforge.net/findbugs/findbugs-3.0.1.tar.gz?download
mv findbugs-3.0.1.tar.gz\?download findbugs-3.0.1.tar.gz

tar zxf findbugs-3.0.1.tar.gz -C /usr/local/
vim /etc/profile
export FINDBUGS_HOME=/usr/local/findbugs-3.0.1
export PATH=$FINDBUGS_HOME/bin:$PATH

source /etc/profile

findbugs -version

Build and install Hadoop:

Download Hadoop: http://hadoop.apache.org/releases.html

wget http://www-eu.apache.org/dist/hadoop/common/hadoop-2.6.4/hadoop-2.6.4-src.tar.gz

tar zxf hadoop-2.6.4-src.tar.gz
cd hadoop-2.6.4-src

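more BUILDING.txt
BUILDING.txt explains how to build and install Hadoop.

mvn clean package -Pdist,native,docs -DskipTests -Dtar
The build downloads a large number of packages, so expect a long wait. It has succeeded when every Hadoop module is reported as SUCCESS in the final summary.

If a download stalls, rerun the command above, or fetch the package manually from the URL shown in the output (https://repo.maven.apache.org/maven2) and place it at the location the message names.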
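
Because rerunning the same command is the suggested remedy for stalled downloads, the reruns can be wrapped in a small loop. This is only a sketch: retry is a hypothetical helper name, and it reruns the command solely on a non-zero exit status, so a download that hangs forever still has to be interrupted by hand.

```shell
# retry MAX_ATTEMPTS CMD [ARGS...]
# Run CMD until it succeeds or MAX_ATTEMPTS runs have failed.
# (retry is an illustrative helper, not part of Maven or Hadoop.)
retry() {
    max_attempts=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max_attempts failed: $*" >&2
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (not executed here):
#   retry 3 mvn package -Pdist,native -DskipTests -Dtar
```

Maven keeps what it has already downloaded in its local repository, so each rerun should only need to fetch what is still missing.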

Error 1:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:1.7:run (site) on project hadoop-common: An Ant BuildException has occured: input file /home/hadoop/hadoop-2.6.4-src/hadoop-common-project/hadoop-common/target/findbugsXml.xml does not exist
[ERROR] around Ant part ...<xslt style="/usr/local/findbugs-3.0.1/src/xsl/default.xsl" in="/home/hadoop/hadoop-2.6.4-src/hadoop-common-project/hadoop-common/target/findbugsXml.xml" out="/home/hadoop/hadoop-2.6.4-src/hadoop-common-project/hadoop-common/target/site/findbugs.html"/>... @ 44:256 in /home/hadoop/hadoop-2.6.4-src/hadoop-common-project/hadoop-common/target/antrun/build-main.xml

Fix 1:
Reference: http://www.itnose.net/detail/6143808.html
Remove the docs profile from the command and run it again: mvn package -Pdist,native -DskipTests -Dtar

Error 2:
[INFO] Executing tasks
main:
[mkdir] Created dir: /home/hadoop/hadoop-2.6.4-src/hadoop-common-project/hadoop-kms/downloads
[get] Getting: http://archive.apache.org/dist/tomcat/tomcat-6/v6.0.41/bin/apache-tomcat-6.0.41.tar.gz
[get] To: /home/hadoop/hadoop-2.6.4-src/hadoop-common-project/hadoop-kms/downloads/apache-tomcat-6.0.41.tar.gz

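Fix 2:
The build hangs at this point, most likely because the download cannot complete.
Download the tarball by other means and upload it to the path shown after To: above.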

Error 3:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:2.8.1:jar (module-javadocs) on project hadoop-hdfs: MavenReportException: Error while creating archive:
[ERROR] ExcludePrivateAnnotationsStandardDoclet
[ERROR] Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000000f31a4000, 130400256, 0) failed; error='Cannot allocate memory' (errno=12)
[ERROR] #
[ERROR] # There is insufficient memory for the Java Runtime Environment to continue.
[ERROR] # Native memory allocation (malloc) failed to allocate 130400256 bytes for committing reserved memory.
[ERROR] # An error report file with more information is saved as:
[ERROR] # /home/hadoop/hadoop-2.6.4-src/hadoop-hdfs-project/hadoop-hdfs/target/hs_err_pid24729.log
[ERROR]
[ERROR] Error occurred during initialization of VM, try to reduce the Java heap size for the MAVEN_OPTS environnement variable using -Xms:<size> and -Xmx:<size>.
[ERROR] Or, try to reduce the Java heap size for the Javadoc goal using -Dminmemory=<size> and -Dmaxmemory=<size>.

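Fix 3:
The machine most likely ran out of memory: it has about 1 GB of RAM and no swap configured.
Add about 2 GB of swap by creating a swap file (the same steps can be used to enlarge existing swap):
dd if=/dev/zero of=/home/swap bs=512 count=4096000
bs is the block size in bytes and count is the number of blocks, so bs=512 count=4096000 writes 512 × 4096000 = 2,097,152,000 bytes.
This creates a zero-filled file /home/swap of about 2 GB. The of= path can be changed to suit your disk layout.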
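
The size of the file that dd writes is just bs multiplied by count, which is easy to check before running the command. The two function names below are hypothetical helpers for this arithmetic only; they do not touch the disk.

```shell
# dd_size_bytes BS COUNT
# Print the size in bytes of a file written by `dd bs=BS count=COUNT`.
dd_size_bytes() {
    echo $(( $1 * $2 ))
}

# bytes_to_mib BYTES
# Convert bytes to whole mebibytes (1 MiB = 1048576 bytes), rounding down.
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

size=$(dd_size_bytes 512 4096000)
echo "bytes: $size"
echo "MiB:   $(bytes_to_mib "$size")"
```

For bs=512 and count=4096000 this prints 2097152000 bytes and 2000 MiB, which is slightly under 2 GiB.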

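Check current memory and swap usage
free -m

Format the file as swap and enable it
mkswap /home/swap
swapon /home/swap

Check the active swap areas
swapon -s

Enable the swap file at boot
vim /etc/fstab
/home/swap swap swap defaults 0 0

To disable the swap file
swapoff /home/swap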

Error 4:
main:
[mkdir] Created dir: /home/hadoop/hadoop-2.6.4-src/hadoop-hdfs-project/hadoop-hdfs-httpfs/downloads
[get] Getting: http://archive.apache.org/dist/tomcat/tomcat-6/v6.0.41/bin/apache-tomcat-6.0.41.tar.gz
[get] To: /home/hadoop/hadoop-2.6.4-src/hadoop-hdfs-project/hadoop-hdfs-httpfs/downloads/apache-tomcat-6.0.41.tar.gz

Fix 4:
This is the same network problem as in error 2. Reuse the tarball that was already downloaded:
cp /home/hadoop/hadoop-2.6.4-src/hadoop-common-project/hadoop-kms/downloads/apache-tomcat-6.0.41.tar.gz /home/hadoop/hadoop-2.6.4-src/hadoop-hdfs-project/hadoop-hdfs-httpfs/downloads/
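
Both hadoop-kms and hadoop-hdfs-httpfs fetch the same Tomcat tarball, so once a copy is available it can be placed in both downloads/ directories in one step before rerunning the build. The function below is a sketch with a hypothetical name (seed_tomcat). It assumes, as the manual fixes above do, that the build accepts a tarball already sitting in the downloads/ directory.

```shell
# seed_tomcat SRC_ROOT TARBALL
# Copy an already-downloaded Tomcat tarball into the downloads/ directory
# of both modules that fetch it during the build.
# (seed_tomcat is an illustrative helper name.)
seed_tomcat() {
    src_root=$1
    tarball=$2
    for module in \
        hadoop-common-project/hadoop-kms \
        hadoop-hdfs-project/hadoop-hdfs-httpfs
    do
        mkdir -p "$src_root/$module/downloads" || return 1
        cp "$tarball" "$src_root/$module/downloads/" || return 1
    done
}

# Example (not executed here):
#   seed_tomcat /home/hadoop/hadoop-2.6.4-src ~/apache-tomcat-6.0.41.tar.gz
```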

Build succeeded

[INFO] Apache Hadoop Gridmix .............................. SUCCESS [  6.239 s]
[INFO] Apache Hadoop Data Join ............................ SUCCESS [  4.070 s]
[INFO] Apache Hadoop Ant Tasks ............................ SUCCESS [  3.304 s]
[INFO] Apache Hadoop Extras ............................... SUCCESS [  4.653 s]
[INFO] Apache Hadoop Pipes ................................ SUCCESS [  8.279 s]
[INFO] Apache Hadoop OpenStack support .................... SUCCESS [  7.736 s]
[INFO] Apache Hadoop Amazon Web Services support .......... SUCCESS [06:22 min]
[INFO] Apache Hadoop Client ............................... SUCCESS [  9.608 s]
[INFO] Apache Hadoop Mini-Cluster ......................... SUCCESS [  0.258 s]
[INFO] Apache Hadoop Scheduler Load Simulator ............. SUCCESS [  6.721 s]
[INFO] Apache Hadoop Tools Dist ........................... SUCCESS [ 15.171 s]
[INFO] Apache Hadoop Tools ................................ SUCCESS [  0.022 s]
[INFO] Apache Hadoop Distribution ......................... SUCCESS [ 37.343 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 31:45 min
[INFO] Finished at: 2016-07-30T23:59:50+08:00
[INFO] Final Memory: 101M/241M
[INFO] ------------------------------------------------------------------------

The compiled distribution is now in hadoop-dist/target/.

Copy it to the user's home directory
cp -r hadoop-dist/target/hadoop-2.6.4 ~/

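Configuration

Passwordless SSH between the two machines

ssh-keygen
Copy each machine's public key to the other machine and append it to authorized_keys there (id_rsa_else.pub is the other machine's key):
cat id_rsa_else.pub >> authorized_keys
chmod 600 authorized_keys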

Network plan:

hadoop1 123.206.33.182 slave
hadoop0 114.215.92.77 master

Configure hosts

vim /etc/hosts
123.206.33.182 hadoop1 tx lizer_tx
114.215.92.77 hadoop0 ali lizer_ali

Configure environment variables

vim /etc/profile
export HADOOP_HOME=/home/hadoop/hadoop-2.6.4
export PATH=$HADOOP_HOME/bin:$PATH

Hadoop configuration

The configuration files are in $HADOOP_HOME/etc/hadoop/.
Edit the following files:

vim hadoop-env.sh
export JAVA_HOME=/opt/jdk1.7.0_80

vim yarn-env.sh
export JAVA_HOME=/opt/jdk1.7.0_80

vim slaves (there is no longer a master configuration file here)
hadoop1

vim core-site.xml

<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/home/hadoop/hadoop/tmp</value>
                <description>A base for other temporary directories.</description>
        </property>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://hadoop0:9000</value>
        </property>
        <property>
                <name>io.file.buffer.size</name>
                <value>4096</value>
        </property>
</configuration>

vim hdfs-site.xml

<configuration>
        <property>
                <name>dfs.http.address</name>
                <value>hadoop0:50070</value>
        </property>
        <property>
                <name>dfs.secondary.http.address</name>
                <value>hadoop0:50090</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>/home/hadoop/hadoop/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/home/hadoop/hadoop/data</value>
        </property>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.nameservices</name>
                <value>hadoop0</value>
        </property>
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>hadoop0:50090</value>
        </property>
        <property>
                <name>dfs.webhdfs.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>dfs.permissions</name>
                <value>false</value>
        </property>
</configuration>

cp mapred-site.xml.template mapred-site.xml

vim mapred-site.xml

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
                <final>true</final>
        </property>
        <property>
                <name>mapreduce.jobtracker.http.address</name>
                <value>hadoop0:50030</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>hadoop0:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>hadoop0:19888</value>
        </property>
        <property>
                <name>mapred.job.tracker</name>
                <value>hadoop0:9001</value>
        </property>
</configuration>

vim yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>hadoop0</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                <name>yarn.resourcemanager.address</name>
                <value>hadoop0:8032</value>
        </property>
        <property>
                <name>yarn.resourcemanager.scheduler.address</name>
                <value>hadoop0:8030</value>
        </property>
        <property>
                <name>yarn.resourcemanager.resource-tracker.address</name>
                <value>hadoop0:8031</value>
        </property>
        <property>
                <name>yarn.resourcemanager.admin.address</name>
                <value>hadoop0:8033</value>
        </property>
        <property>
                <name>yarn.resourcemanager.webapp.address</name>
                <value>hadoop0:8088</value>
        </property>
</configuration>

vim master
hadoop0

scp -r /home/hadoop/hadoop-2.6.4/etc/hadoop/* tx:~/hadoop-2.6.4/etc/hadoop/

Starting and stopping Hadoop

Reference: http://my.oschina.net/penngo/blog/653049

bin/hdfs namenode -format
sbin/start-dfs.sh
sbin/stop-dfs.sh
sbin/start-yarn.sh
sbin/stop-yarn.sh
sbin/mr-jobhistory-daemon.sh start historyserver
sbin/mr-jobhistory-daemon.sh stop historyserver
sbin/hadoop-daemon.sh start secondarynamenode
sbin/hadoop-daemon.sh stop secondarynamenode
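
The commands above are listed as start/stop pairs. In practice the daemons are usually started in the order HDFS, YARN, job history server, and stopped in reverse. The wrapper below sketches that order. The function names and the DRY_RUN switch are hypothetical, HADOOP_HOME is expected to be set as configured earlier, and the format step is left out because it is run only once. The secondarynamenode command is also left out, on the assumption that start-dfs.sh already starts it; add it back if jps does not show it.

```shell
# run CMD [ARGS...]
# Execute the command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# start_cluster: start HDFS, then YARN, then the job history server.
# Stops at the first step that fails.
start_cluster() {
    run "$HADOOP_HOME/sbin/start-dfs.sh" &&
    run "$HADOOP_HOME/sbin/start-yarn.sh" &&
    run "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" start historyserver
}

# stop_cluster: stop the same services in the reverse order.
stop_cluster() {
    run "$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" stop historyserver
    run "$HADOOP_HOME/sbin/stop-yarn.sh"
    run "$HADOOP_HOME/sbin/stop-dfs.sh"
}
```

Setting DRY_RUN=1 before calling start_cluster prints the three commands without running them, which is a cheap way to check the paths first.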

[hadoop@lizer_ali hadoop-2.6.4]$ jps
3099 ResourceManager
3430 SecondaryNameNode
2879 NameNode
3470 Jps
3382 JobHistoryServer

[hadoop@lizer_tx ~]$ jps
9757 DataNode
9853 NodeManager
10064 Jps
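
The jps listings above can also be checked mechanically. The sketch below compares jps output against a list of expected daemon names. missing_daemons is a hypothetical helper, and the expected names are taken from the two listings in this post.

```shell
# missing_daemons "JPS_OUTPUT" NAME...
# Print each expected daemon NAME that is absent from JPS_OUTPUT, where
# JPS_OUTPUT consists of lines of the form "<pid> <ClassName>".
# Returns 0 only if nothing is missing.
# (missing_daemons is an illustrative helper name.)
missing_daemons() {
    jps_output=$1
    shift
    rc=0
    for name in "$@"; do
        # Compare against the second column exactly, so that NameNode
        # is not satisfied by SecondaryNameNode.
        if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qxF -- "$name"; then
            echo "$name"
            rc=1
        fi
    done
    return $rc
}

# Examples (not executed here):
#   on hadoop0: missing_daemons "$(jps)" NameNode SecondaryNameNode ResourceManager JobHistoryServer
#   on hadoop1: missing_daemons "$(jps)" DataNode NodeManager
```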

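Check the state of the nodes
bin/hdfs dfsadmin -report

YARN ResourceManager web UI (cluster and node management)
http://114.215.92.77:8088/cluster

HDFS NameNode web UI
http://114.215.92.77:50070/dfshealth.html#tab-overview

Create a directory in HDFS

bin/hdfs dfs -mkdir -p input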

References:

http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-common/ClusterSetup.html

http://hadoop.apache.org/docs/r2.6.4/hadoop-project-dist/hadoop-common/SingleCluster.html
