0. Project Background
Build a private big data platform on Alibaba Cloud ECS instances, using the Apache Hadoop ecosystem to provide storage and processing for big data workloads.
1. Purchasing ECS Instances
This exercise needs three nodes, so we purchase three ECS instances.
![b7a8ceb2e5d2009acfcc93352262ab22084eed9c](https://yqfile.alicdn.com/b7a8ceb2e5d2009acfcc93352262ab22084eed9c.png?x-oss-process=image/resize,w_1400/format,webp)
![8e8e45292577f6ddbdc43c6a4a11a54b899692fd](https://yqfile.alicdn.com/8e8e45292577f6ddbdc43c6a4a11a54b899692fd.png?x-oss-process=image/resize,w_1400/format,webp)
2. Log In to the Servers Remotely and Configure the Base Environment
# As the saying goes: sharpen your tools before you start the work
# Prerequisites
# Install the basic system utilities
yum -y install wget vim ntpdate net-tools
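Hadoop assumes the nodes' clocks agree, so it is worth using the ntpdate utility installed above right away. A minimal sketch, run on every node; ntp.aliyun.com is Alibaba Cloud's public NTP service, but any reachable NTP server works:
# Sync the system clock against an NTP server (run on every node)
ntpdate ntp.aliyun.com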
2.1 Node information
172.18.53.98 master
172.18.53.99 slave1
172.18.53.100 slave2
2.2 Change the hostname (required on every node)
![76b34f7540a6724f100d02d7ede036d2931c1116](https://yqfile.alicdn.com/76b34f7540a6724f100d02d7ede036d2931c1116.png?x-oss-process=image/resize,w_1400/format,webp)
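The screenshot above shows the change; as a sketch, on a systemd-based image such as CentOS 7 (which the systemctl/firewalld commands below also assume), the hostname can be set with hostnamectl using the node names from 2.1:
# Run the matching command on each node
hostnamectl set-hostname master   # on 172.18.53.98
hostnamectl set-hostname slave1   # on 172.18.53.99
hostnamectl set-hostname slave2   # on 172.18.53.100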
2.3 Configure the hosts file (run on every node)
vi /etc/hosts
172.18.53.98 master
172.18.53.99 slave1
172.18.53.100 slave2
![f8b35d502324f21f724b2d5f92a5dcca244e8fa8](https://yqfile.alicdn.com/f8b35d502324f21f724b2d5f92a5dcca244e8fa8.png?x-oss-process=image/resize,w_1400/format,webp)
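Once the file is saved on every node, a quick check that the names resolve to the private IPs:
# Each hostname should answer from its private address
ping -c 3 slave1
ping -c 3 slave2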
2.4 Disable the system firewall (firewalld) and SELinux (run on every node)
# Disable SELinux temporarily
setenforce 0
# Disable SELinux permanently (takes effect after a reboot)
vi /etc/selinux/config
SELINUX=disabled
![0bd06bcd70b17d1ae79cc6019b69e6b5f1ee2c4f](https://yqfile.alicdn.com/0bd06bcd70b17d1ae79cc6019b69e6b5f1ee2c4f.png?x-oss-process=image/resize,w_1400/format,webp)
# Stop the system firewall now
systemctl stop firewalld.service
# Keep the system firewall disabled across reboots
systemctl disable firewalld.service
![6a67f26ade4ee32271cb584c7993fdd686fdedea](https://yqfile.alicdn.com/6a67f26ade4ee32271cb584c7993fdd686fdedea.png?x-oss-process=image/resize,w_1400/format,webp)
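To confirm both are really off, a quick check on each node:
# Expect "Permissive" (or "Disabled" after a reboot) and an inactive firewalld
getenforce
systemctl status firewalld.service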
2.5 Passwordless SSH between the nodes
# Generate a key pair; press Enter three times to accept the defaults (run on every node)
ssh-keygen -t rsa
![0a8a8e3d25000ba0f5962c89c980f2640d5c2cd0](https://yqfile.alicdn.com/0a8a8e3d25000ba0f5962c89c980f2640d5c2cd0.png?x-oss-process=image/resize,w_1400/format,webp)
# Write the master's public key into authorized_keys (run on the master)
cat /root/.ssh/id_rsa.pub > /root/.ssh/authorized_keys
chmod 600 /root/.ssh/authorized_keys
# Append the other nodes' public keys (run on the master; each slave's root password is prompted for)
ssh slave1 cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
ssh slave2 cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys
![b33b0d10d8ea8c1303e8672658448175e67e4503](https://yqfile.alicdn.com/b33b0d10d8ea8c1303e8672658448175e67e4503.png?x-oss-process=image/resize,w_1400/format,webp)
# Distribute the combined authorized_keys to the other nodes (run on the master)
scp /root/.ssh/authorized_keys root@slave1:/root/.ssh/authorized_keys
scp /root/.ssh/authorized_keys root@slave2:/root/.ssh/authorized_keys
![22bff3ea90a8ea757d0e874ee6129f0231240b5c](https://yqfile.alicdn.com/22bff3ea90a8ea757d0e874ee6129f0231240b5c.png?x-oss-process=image/resize,w_1400/format,webp)
# Test passwordless SSH
ssh slave1 ip addr
ssh slave2 ip addr
![de27e8025cb95a66288266c023a563d2dd6ae396](https://yqfile.alicdn.com/de27e8025cb95a66288266c023a563d2dd6ae396.png?x-oss-process=image/resize,w_1400/format,webp)
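With passwordless SSH in place, per-node checks can now be scripted from the master; for example, comparing the clocks that ntpdate synchronized earlier:
# Print each node's time in one pass
for h in master slave1 slave2; do echo "== $h =="; ssh $h date; done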
At this point the base environment is ready: time sync, hostnames, the firewall, and SSH trust have all been configured.
3. Installing the JDK
# Unpack the JDK tarball
cd /usr/local/src
tar zxvf jdk-8u191-linux-x64.tar.gz
# Configure the environment variables; append the following at the end of the file
vim /etc/profile
JAVA_HOME=/usr/local/src/jdk1.8.0_191
JAVA_BIN=/usr/local/src/jdk1.8.0_191/bin
JRE_HOME=/usr/local/src/jdk1.8.0_191/jre
CLASSPATH=/usr/local/src/jdk1.8.0_191/jre/lib:/usr/local/src/jdk1.8.0_191/lib:/usr/local/src/jdk1.8.0_191/jre/lib/charsets.jar
PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
export JAVA_HOME JAVA_BIN JRE_HOME CLASSPATH PATH
![785594ae752e34aff7a41f4ec7664a760a9e488a](https://yqfile.alicdn.com/785594ae752e34aff7a41f4ec7664a760a9e488a.png?x-oss-process=image/resize,w_1400/format,webp)
# Copy the environment file to the other nodes
scp /etc/profile root@slave1:/etc/profile
scp /etc/profile root@slave2:/etc/profile
# Copy the JDK directory to the other nodes
scp -r /usr/local/src/jdk1.8.0_191 root@slave1:/usr/local/src/jdk1.8.0_191
scp -r /usr/local/src/jdk1.8.0_191 root@slave2:/usr/local/src/jdk1.8.0_191
# Reload the environment variables (run on every node)
source /etc/profile
# Verify that the Java environment works
java -version
![fbc5393be321d6d20de5a4ead138f8b4d2a7704a](https://yqfile.alicdn.com/fbc5393be321d6d20de5a4ead138f8b4d2a7704a.png?x-oss-process=image/resize,w_1400/format,webp)
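To confirm all three nodes see the same JDK, a small sketch from the master (java -version writes to stderr, and a non-interactive SSH session does not load /etc/profile on its own, hence the explicit source):
# Check the Java version on every node
for h in master slave1 slave2; do echo "== $h =="; ssh $h "source /etc/profile; java -version" 2>&1; done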
This completes the Java runtime setup. Hadoop is written in Java, so every node needs a working JVM before Hadoop can run.
4. Installing Hadoop
# Unpack the Hadoop tarball
cd /usr/local/src
tar zxvf hadoop-2.6.5.tar.gz
# Edit the config file: add the Java environment variable at line 24
cd hadoop-2.6.5/etc/hadoop/
vim hadoop-env.sh
export JAVA_HOME=/usr/local/src/jdk1.8.0_191
![b82c2983e9658bf8e428eba4c242f3cda3dc8186](https://yqfile.alicdn.com/b82c2983e9658bf8e428eba4c242f3cda3dc8186.png?x-oss-process=image/resize,w_1400/format,webp)
# Edit the config file: add the Java environment variable at line 24
vim yarn-env.sh
export JAVA_HOME=/usr/local/src/jdk1.8.0_191
![350fa25cdcdfcb839904f2f4b996ad61ac1e6d51](https://yqfile.alicdn.com/350fa25cdcdfcb839904f2f4b996ad61ac1e6d51.png?x-oss-process=image/resize,w_1400/format,webp)
# Edit the slaves file: list the slave node hostnames
vim slaves
slave1
slave2
# Edit core-site.xml: set the HDFS RPC endpoint and the temp directory
vim core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://172.18.53.98:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/usr/local/src/hadoop-2.6.5/tmp</value>
</property>
</configuration>
![10523bb35a494960438c830c5ef75d7448709623](https://yqfile.alicdn.com/10523bb35a494960438c830c5ef75d7448709623.png?x-oss-process=image/resize,w_1400/format,webp)
# Edit hdfs-site.xml: set the SecondaryNameNode address and the storage directories
vim hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:9001</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/src/hadoop-2.6.5/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/src/hadoop-2.6.5/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
# Edit mapred-site.xml: run MapReduce on the YARN framework
# (the file does not exist by default; create it from the bundled template first)
cp mapred-site.xml.template mapred-site.xml
vim mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
![ec30bd722f21d88c36c06352074f6c8f3506cd51](https://yqfile.alicdn.com/ec30bd722f21d88c36c06352074f6c8f3506cd51.png?x-oss-process=image/resize,w_1400/format,webp)
# Edit yarn-site.xml: configure the shuffle service and the ResourceManager addresses
vim yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8035</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
</property>
</configuration>
# Create the directories referenced in the configuration above
mkdir /usr/local/src/hadoop-2.6.5/tmp
mkdir -p /usr/local/src/hadoop-2.6.5/dfs/name
mkdir -p /usr/local/src/hadoop-2.6.5/dfs/data
# Append the Hadoop environment variables at the end of the file
vim /etc/profile
export HADOOP_HOME=/usr/local/src/hadoop-2.6.5
export PATH=$PATH:$HADOOP_HOME/bin
# Copy the environment file and the Hadoop directory to the other nodes
scp /etc/profile root@slave1:/etc/profile
scp /etc/profile root@slave2:/etc/profile
scp -r /usr/local/src/hadoop-2.6.5 root@slave1:/usr/local/src/hadoop-2.6.5
scp -r /usr/local/src/hadoop-2.6.5 root@slave2:/usr/local/src/hadoop-2.6.5
# Reload the environment variables (run on every node)
source /etc/profile
# Format the NameNode (run on the master only)
hadoop namenode -format
![a98d62aa03aa08cac44a97b1af08c4eba3a9d743](https://yqfile.alicdn.com/a98d62aa03aa08cac44a97b1af08c4eba3a9d743.png?x-oss-process=image/resize,w_1400/format,webp)
# The line below indicates that the format completed successfully
common.Storage: Storage directory /usr/local/src/hadoop-2.6.5/dfs/name has been successfully formatted
# Start the cluster (run on the master)
/usr/local/src/hadoop-2.6.5/sbin/start-all.sh
![f6a9aac788bfba042e6be6d7f5d67dd8555633b5](https://yqfile.alicdn.com/f6a9aac788bfba042e6be6d7f5d67dd8555633b5.png?x-oss-process=image/resize,w_1400/format,webp)
# Check the running service processes with jps
# Master: ResourceManager - NameNode - SecondaryNameNode
![07828e3b12e13c695067d6cfc6e859aa77c3908c](https://yqfile.alicdn.com/07828e3b12e13c695067d6cfc6e859aa77c3908c.png?x-oss-process=image/resize,w_1400/format,webp)
# Slave: NodeManager - DataNode
![6f1053e700be4383ec2cfd163651a140dec5a48b](https://yqfile.alicdn.com/6f1053e700be4383ec2cfd163651a140dec5a48b.png?x-oss-process=image/resize,w_1400/format,webp)
![4b97077a70101fd1b61f9f3517bb301d4249395e](https://yqfile.alicdn.com/4b97077a70101fd1b61f9f3517bb301d4249395e.png?x-oss-process=image/resize,w_1400/format,webp)
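The same check can be scripted from the master instead of logging in to each node; jps ships with the JDK:
# List the Java daemons on all three nodes
for h in master slave1 slave2; do echo "== $h =="; ssh $h "source /etc/profile; jps"; done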
# Web consoles (first add host records to your local hosts file)
# Windows: C:\Windows\System32\drivers\etc\hosts
# Linux:   /etc/hosts
# Mac:     /etc/hosts
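Note that 172.18.53.x are the instances' private VPC addresses; a local machine outside the VPC needs the master's public IP (or EIP) in its hosts file instead. A hypothetical entry, with the address made up for illustration:
# Local hosts file entry (replace with your master's actual public IP)
203.0.113.10 master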
![63d9041dd9066570fb8ea21602bd818bfef61aa7](https://yqfile.alicdn.com/63d9041dd9066570fb8ea21602bd818bfef61aa7.png?x-oss-process=image/resize,w_1400/format,webp)
# YARN web UI
# Browse to http://master:8088/cluster
![9b669ee322924b569784877931b708833f10a4af](https://yqfile.alicdn.com/9b669ee322924b569784877931b708833f10a4af.png?x-oss-process=image/resize,w_1400/format,webp)
# HDFS web UI
# Browse to http://master:50070
![93cfed35e8888d99197423ba7e8c921a49dbdccd](https://yqfile.alicdn.com/93cfed35e8888d99197423ba7e8c921a49dbdccd.png?x-oss-process=image/resize,w_1400/format,webp)
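Finally, a short smoke test that exercises HDFS and YARN end to end; the examples jar below ships with the Hadoop 2.6.5 distribution:
# Write a file into HDFS and list it back
hadoop fs -mkdir -p /test
hadoop fs -put /etc/hosts /test/
hadoop fs -ls /test
# Estimate pi with the bundled MapReduce example (2 maps, 10 samples each)
hadoop jar /usr/local/src/hadoop-2.6.5/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar pi 2 10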