Study notes for the Developer Academy course "Quickly Master Hadoop Integration with Kerberos Security: Configure HDFS - Compile executor-container"; the notes follow the course closely so that users can pick up the material quickly.
Course URL: https://developer.aliyun.com/learning/course/708/detail/12562
Configure HDFS - Compile executor-container
Contents:
I. Install protobuf
II. Set permissions on the directories Hadoop needs
I. Installing protobuf
Build protobuf. It only needs to be installed on cdh0; once cdh0 is set up, copy it to the other machines.
1. Upload protobuf-2.5.0.tar.gz from the installation package, extract it, enter the directory, and run:
Run ./configure and press Enter:
[root@cdh0 protobuf-2.5.0]# ./configure
Run make and press Enter; make is fairly slow while it compiles:
[root@cdh0 protobuf-2.5.0]# make
Run make install:
[root@cdh0 protobuf-2.5.0]# make install
protobuf is now installed.
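The protobuf just built exists only on cdh0. A minimal sketch of propagating it, assuming the default /usr/local install prefix and working ssh to cdh1/cdh2 (alternatively, simply repeat the same ./configure, make, make install on each node):
# Refresh the linker cache and confirm the version on cdh0
ldconfig && protoc --version        # should print: libprotoc 2.5.0
# Copy the installed compiler and libraries to the other machines (paths are assumptions)
for h in cdh1 cdh2; do
  scp /usr/local/bin/protoc $h:/usr/local/bin/
  scp /usr/local/lib/libproto* $h:/usr/local/lib/
  ssh $h ldconfig
done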
2. Build the Linux-Container-executor from source
When Kerberos is used, YARN tasks must run through the LinuxContainerExecutor container; this container has to be built from source and uses cgroups as the basic unit of resource isolation.
If you cannot build it yourself, a prebuilt binary (built on CentOS 6.9 with CDH 5.14.4) can be used, but it is not 100% guaranteed to run; if the versions match it will generally work. On CentOS 7 there are small differences, so it cannot be guaranteed either; it is best to build your own so you are sure it runs.
Copy and paste:
cd $HADOOP_HOME/src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
Use pwd to check the path; it is the following directory under /bigdata/hadoop-2.6.0-cdh5.14.4:
src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
[root@cdh0 ~]# cd $HADOOP_HOME/src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
[root@cdh0 hadoop-yarn-server-nodemanager]# pwd
/bigdata/hadoop-2.6.0-cdh5.14.4/src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
Compile using the pom in this directory:
-rw-r--r-- 1 1106 4001 12076 Jun 12 2018 pom.xml
drwxr-xr-x 4 1106 4001  4096 Jun 12 2018 src
Run mvn package -Pdist,native -DskipTests -Dtar -Dcontainer-executor.conf.dir=/bigdata/hadoop-2.6.0-cdh5.14.4/etc/hadoop/
Copy, paste, and run:
[root@cdh0 hadoop-yarn-server-nodemanager]# mvn package -Pdist,native -DskipTests -Dtar -Dcontainer-executor.conf.dir=/bigdata/hadoop-2.6.0-cdh5.14.4/etc/hadoop
Echo the current $JAVA_HOME: the build is running on JDK 1.8, which is certain to cause errors later on.
[root@cdh0 hadoop-yarn-server-nodemanager]# echo $JAVA_HOME
/usr/local/jdk1.8.0_221
A pre-built local Maven repository is distributed with the course materials; with a local Maven repository the network download step is skipped, which saves time.
An error appears: the dependencies cannot be resolved. This is a network problem, so rerun the build.
Add a proxy:
-Dhttps.protocols=TLSv1.1,TLSv1.2 -Dhttp.proxyHost=192.168.66.1 -Dhttp.proxyPort=1080 -Dhttps.proxyHost=192.168.66.1 -Dhttps.proxyPort=1080
[root@cdh0 hadoop-yarn-server-nodemanager]# mvn package -Pdist,native -DskipTests -Dtar -Dcontainer-executor.conf.dir=/bigdata/hadoop-2.6.0-cdh5.14.4/etc/hadoop/ -Dhttps.protocols=TLSv1.1,TLSv1.2 -Dhttp.proxyHost=192.168.66.1 -Dhttp.proxyPort=1086 -Dhttps.proxyHost=192.168.66.1 -Dhttps.proxyPort=1086
Sometimes the machine cannot reach the internet at all; a working network environment is required. If your network environment is poor, use a proxy to improve it.
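If the proxy is needed for more than one run, the same JVM options can be exported once through MAVEN_OPTS instead of being repeated on every command line; a small sketch, reusing the proxy host and port from the example above (adjust to your own proxy):
# Export the proxy-related JVM options once, then run the same mvn command without them
export MAVEN_OPTS="-Dhttps.protocols=TLSv1.1,TLSv1.2 -Dhttp.proxyHost=192.168.66.1 -Dhttp.proxyPort=1080 -Dhttps.proxyHost=192.168.66.1 -Dhttps.proxyPort=1080"
mvn package -Pdist,native -DskipTests -Dtar -Dcontainer-executor.conf.dir=/bigdata/hadoop-2.6.0-cdh5.14.4/etc/hadoop/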
The build gets about halfway through and then fails with errors; they are syntax errors.
[ERROR] ...NodeManagerHardwareUtils.java:69: error: bad use of '>'
[ERROR]  * This is percent > 0 and <= 100 based on
[ERROR]                     ^
[ERROR] /bigdata/hadoop-2.6.0-cdh5.14.4/src/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/util/NodeManagerHardwareUtils.java:69: error: malformed HTML
[ERROR]  * This is percent > 0 and <= 100 based on
Check the pom file in the directory:
[root@cdh0 hadoop-yarn-server-nodemanager]# vim pom.xml
The source targets JDK 1.7.
E486: Pattern not found: 1\.7
The code was written against JDK 1.7; compiling it with JDK 1.8 produces these syntax errors.
[root@cdh0 hadoop-yarn-server-nodemanager]# echo $JAVA_HOME
/usr/local/jdk1.8.0_221
The dependencies have to be downloaded over the network from the official repository, which only supports HTTPS with TLS 1.2; JDK 1.7 defaults to TLS 1.1, so downloading with it fails. Therefore, first use JDK 1.8 to finish downloading everything (the syntax errors during that compile do not matter), then switch the JDK to 1.7. Once the downloads are complete, the build can be done with 1.7.
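A minimal sketch of this two-phase idea, assuming the JDK paths shown on this machine (/usr/local/jdk1.8.0_221 and /usr/local/jdk1.7.0_65). The course simply lets the JDK 1.8 run download everything and then fail; this sketch instead uses mvn dependency:go-offline so the first phase only downloads:
# Phase 1: download all dependencies with JDK 1.8, which can speak TLS 1.2 to the repository
export JAVA_HOME=/usr/local/jdk1.8.0_221
mvn dependency:go-offline -Pdist,native -DskipTests
# Phase 2: switch to JDK 1.7 and compile against the now-populated local repository
export JAVA_HOME=/usr/local/jdk1.7.0_65
mvn package -Pdist,native -DskipTests -Dtar -Dcontainer-executor.conf.dir=/bigdata/hadoop-2.6.0-cdh5.14.4/etc/hadoop/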
JDK 1.7 is already installed on this machine at /usr/local/jdk1.7.0_65.
Modify the mvn launcher script under /bigdata:
[root@cdh0 hadoop-yarn-server-nodemanager]# vim /bigdata/apache-maven-3.0.5/bin/mvn
Add the line: export JAVA_HOME=/usr/local/jdk1.7.0_65
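A quick way to confirm the change took effect, assuming the Maven path used above; mvn -version reports the JDK it will run with:
/bigdata/apache-maven-3.0.5/bin/mvn -version    # the "Java version" line should now show 1.7.0_65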
Run the build again; this time it compiles normally.
[root@cdh0 hadoop-yarn-server-nodemanager]# mvn package -Pdist,native -DskipTests -Dtar -Dcontainer-executor.conf.dir=/bigdata/hadoop-2.6.0-cdh5.14.4/etc/hadoop/ -Dhttps.protocols=TLSv1.1,TLSv1.2 -Dhttp.proxyHost=192.168.66.1 -Dhttp.proxyPort=1086 -Dhttps.proxyHost=192.168.66.1 -Dhttps.proxyPort=1086
The downloads were all completed under JDK 1.8 (JDK 1.7 cannot download them); this run has nothing to download and goes straight into compiling the source.
[INFO] BUILD SUCCESS indicates the build succeeded.
A new target folder appears; cd into it:
[root@cdh0 hadoop-yarn-server-nodemanager]# cd target/
Going further down there is a native folder, and under it target/usr/local/bin contains container-executor and test-container-executor.
container-executor is the file we need; copy both files into the bin directory of the Hadoop installation under /bigdata:
-rwxr-xr-x 1 root root 141591 Sep 27 19:19 container-executor
-rwxr-xr-x 1 root root 151412 Sep 27 19:19 test-container-executor
[root@cdh0 bin]# cp * /bigdata/hadoop-2.6.0-cdh5.14.4/bin/
[root@cdh0 bin]# cd
Switch to the Hadoop bin directory; the container-executor and test-container-executor files are now there:
[root@cdh0 ~]# cd /bigdata/hadoop-2.6.0-cdh5.14.4/bin
[root@cdh0 bin]# ll
total 380
-rwxr-xr-x 1 root root 141591 Sep 27 19:44 container-executor
-rwxr-xr-x 1 1106 4001   5701 Jun 12  2018 hadoop
-rwxr-xr-x 1 1106 4001   8443 Jun 12  2018 hadoop.cmd
-rwxr-xr-x 1 1106 4001  12356 Jun 12  2018 hdfs
-rwxr-xr-x 1 1106 4001   6915 Jun 12  2018 hdfs.cmd
-rwxr-xr-x 1 1106 4001   5463 Jun 12  2018 mapred
-rwxr-xr-x 1 1106 4001   5949 Jun 12  2018 mapred.cmd
-rwxr-xr-x 1 1106 4001   1776 Jun 12  2018 rcc
-rwxr-xr-x 1 root root 151412 Sep 27 19:44 test-container-executor
-rwxr-xr-x 1 1106 4001  12476 Jun 12  2018 yarn
-rwxr-xr-x 1 1106 4001  10895 Jun 12  2018 yarn.cmd
Copy to cdh1; `pwd` stands for the current directory:
[root@cdh0 bin]# scp container-executor test-container-executor cdh1:`pwd`/
container-executor        100% 138KB 138.3KB/s 00:00
test-container-executor   100% 148KB 147.9KB/s 00:00
Copy to cdh2:
[root@cdh0 bin]# scp container-executor test-container-executor cdh2:`pwd`/
container-executor        100% 138KB 138.3KB/s 00:00
test-container-executor   100% 148KB 147.9KB/s 00:00
Both files have now been compiled and copied successfully.
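A small check that the binaries landed on every node with the executable bit intact; a sketch assuming passwordless ssh from cdh0:
# List both files on all three machines from cdh0
for h in cdh0 cdh1 cdh2; do
  echo "== $h =="
  ssh $h "ls -l /bigdata/hadoop-2.6.0-cdh5.14.4/bin/*container-executor*"
done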
The local Maven repository has been packaged into the installation bundle; uploading that repository lets you skip the download step entirely.
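How the packaged repository is restored depends on how it was archived; a sketch assuming a hypothetical tarball named m2-repository.tar.gz containing the contents of ~/.m2/repository:
# Unpack the pre-downloaded artifacts so mvn resolves everything locally (archive name is hypothetical)
mkdir -p ~/.m2
tar -xzf m2-repository.tar.gz -C ~/.m2/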
II. Run the following script to set permissions on the directories Hadoop needs
chown hdfs:hadoop $HADOOP_HOME/sbin/distribute-exclude.sh
chown hdfs:hadoop $HADOOP_HOME/sbin/hadoop-daemon.sh
chown hdfs:hadoop $HADOOP_HOME/sbin/hadoop-daemons.sh
chown hdfs:hadoop $HADOOP_HOME/sbin/hdfs-config.cmd
chown hdfs:hadoop $HADOOP_HOME/sbin/hdfs-config.sh
chown mapred:hadoop $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh
chown hdfs:hadoop $HADOOP_HOME/sbin/refresh-namenodes.sh
chown hdfs:hadoop $HADOOP_HOME/sbin/slaves.sh
chown hdfs:hadoop $HADOOP_HOME/sbin/start-all.cmd
chown hdfs:hadoop $HADOOP_HOME/sbin/start-all.sh
chown hdfs:hadoop $HADOOP_HOME/sbin/start-dfs.sh
chown yarn:hadoop $HADOOP_HOME/sbin/start-yarn.sh
chown hdfs:hadoop $HADOOP_HOME/sbin/stop-all.cmd
chown hdfs:hadoop $HADOOP_HOME/sbin/stop-all.sh
chown hdfs:hadoop $HADOOP_HOME/sbin/stop-dfs.cmd
chown hdfs:hadoop $HADOOP_HOME/sbin/stop-dfs.sh
chown mapred:hadoop $HADOOP_HOME/bin/mapred*
chown yarn:hadoop $HADOOP_HOME/bin/yarn*
chown hdfs:hadoop $HADOOP_HOME/bin/hdfs*
chmod 755 -R $HADOOP_HOME/etc/hadoop/*
chown root:hadoop $HADOOP_HOME/etc
chown root:hadoop $HADOOP_HOME/etc/hadoop
chown root:hadoop $HADOOP_HOME/etc/hadoop/container-executor.cfg
chown root:hadoop $HADOOP_HOME/bin/container-executor
chown root:hadoop $HADOOP_HOME/bin/test-container-executor
chmod 6050 $HADOOP_HOME/bin/container-executor
chmod 6050 $HADOOP_HOME/bin/test-container-executor
mkdir $HADOOP_HOME/logs
mkdir $HADOOP_HOME/logs/hdfs
mkdir $HADOOP_HOME/logs/yarn
chown root:hadoop $HADOOP_HOME/logs
chmod 775 $HADOOP_HOME/logs
chown hdfs:hadoop $HADOOP_HOME/logs/hdfs
chmod 755 -R $HADOOP_HOME/logs/hdfs
chown yarn:hadoop $HADOOP_HOME/logs/yarn
chmod 755 -R $HADOOP_HOME/logs/yarn
chown -R hdfs:hadoop $DFS_DATANODE_DATA_DIR
chown -R hdfs:hadoop $DFS_NAMENODE_NAME_DIR
chmod 700 $DFS_DATANODE_DATA_DIR
chmod 700 $DFS_NAMENODE_NAME_DIR
chown -R yarn:hadoop $NODEMANAGER_LOCAL_DIR
chown -R yarn:hadoop $NODEMANAGER_LOG_DIR
chmod 770 $NODEMANAGER_LOCAL_DIR
chmod 770 $NODEMANAGER_LOG_DIR
chown -R mapred:hadoop $MR_HISTORY
chmod 770 $MR_HISTORY
In $HADOOP_HOME/sbin, the shell scripts that HDFS runs are assigned to the hdfs account and the MapReduce history script to mapred; in bin, scripts starting with mapred belong to the mapred account, those starting with yarn to the yarn account, and those starting with hdfs to the hdfs account.
The configuration files under etc are given permissions matching the corresponding accounts: everything under etc/hadoop gets mode 755, while the etc and etc/hadoop directories themselves are owned by the root account, and so on. container-executor and test-container-executor must be set to mode 6050.
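Mode 6050 combines the setuid and setgid bits with group read/execute only: the binary, owned by root:hadoop, runs as root but can only be launched by members of the hadoop group. A quick check after running the script (assuming HADOOP_HOME is set as in the variables below):
# Expected output: root hadoop 6050, shown by ls as ---Sr-s---
stat -c '%U %G %a %n' $HADOOP_HOME/bin/container-executor
ls -l $HADOOP_HOME/bin/container-executor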
Copy the script into an .sh file and run it.
# Adjust the settings below to match your own directory layout
HADOOP_HOME=/bigdata/hadoop-2.6.0-cdh5.14.4
DFS_NAMENODE_NAME_DIR=/data/nn
DFS_DATANODE_DATA_DIR=/data/dn
NODEMANAGER_LOCAL_DIR=/data/nm-local
NODEMANAGER_LOG_DIR=/data/nm-log
MR_HISTORY=/data/mr-history
Pay attention to the variables: HADOOP_HOME, the NameNode and DataNode directories, and the rest must all be set to the actual paths on your own machines.
Copy it, go back to the server, create a.sh, and put all of it into a.sh:
[root@cdh0 ~]# vim a.sh
Run sh a.sh; errors are reported saying the logs directories already exist.
[root@cdh0 ~]# sh a.sh
mkdir: cannot create directory `/bigdata/hadoop-2.6.0-cdh5.14.4/logs': File exists
mkdir: cannot create directory `/bigdata/hadoop-2.6.0-cdh5.14.4/logs/yarn': File exists
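The error is harmless: the directories were already created by an earlier run. If the script should be rerunnable without this noise, mkdir -p (a small adjustment, not made in the course) silently accepts existing directories:
mkdir -p $HADOOP_HOME/logs/hdfs $HADOOP_HOME/logs/yarn   # -p also creates $HADOOP_HOME/logs and ignores existing dirs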
Check with cd /bigdata/hadoop-2.6.0-cdh5.14.4/; the script has already created the logs folder:
[root@cdh0 ~]# cd /bigdata/hadoop-2.6.0-cdh5.14.4/
[root@cdh0 hadoop-2.6.0-cdh5.14.4]# ll
total 156
drwxr-xr-x  2 hdfs hadoop  4096 Sep 27 19:44 bin
drwxr-xr-x  2 hdfs hadoop  4096 Jun 12  2018 bin-mapreduce1
drwxr-xr-x  3 hdfs hadoop  4096 Jun 12  2018 cloudera
drwxr-xr-x  6 root hadoop  4096 Jun 12  2018 etc
drwxr-xr-x  5 hdfs hadoop  4096 Jun 12  2018 examples
drwxr-xr-x  3 hdfs hadoop  4096 Jun 12  2018 examples-mapreduce1
drwxr-xr-x  2 hdfs hadoop  4096 Jun 12  2018 include
drwxr-xr-x  3 hdfs hadoop  4096 Jun 12  2018 lib
drwxr-xr-x  3 hdfs hadoop  4096 Jun 12  2018 libexec
-rw-r--r--  1 hdfs hadoop 85063 Jun 12  2018 LICENSE.txt
drwxrwxr-x  5 root hadoop  4096 Sep 27 19:47 logs
-rw-r--r--  1 hdfs hadoop 14978 Jun 12  2018 NOTICE.txt
-rw-r--r--  1 hdfs hadoop  1366 Jun 12  2018 README.txt
drwxr-xr-x  3 hdfs hadoop  4096 Jun 12  2018 sbin
drwxr-xr-x  4 hdfs hadoop  4096 Jun 12  2018 share
drwxr-xr-x 18 hdfs hadoop  4096 Jun 12  2018 src
Copy the .sh file to cdh1 and cdh2 and perform the same steps there (one way to do this is sketched below).
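A sketch of pushing and running the script on the remaining nodes in a single pass, assuming passwordless ssh and identical paths on every machine:
# Distribute a.sh from cdh0 and execute it on cdh1 and cdh2
for h in cdh1 cdh2; do
  scp ~/a.sh $h:~/
  ssh $h "sh ~/a.sh"
done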
After all three machines have finished, look at the /bigdata directory: the Hadoop installation's files all belong to the hadoop group, most of them owned by the hdfs account and a smaller part by root.
[root@cdh0 bigdata]# cd hadoop-2.6.0-cdh5.14.4/
[root@cdh0 hadoop-2.6.0-cdh5.14.4]# ll
total 156
drwxr-xr-x  2 hdfs hadoop  4096 Sep 27 19:44 bin
drwxr-xr-x  2 hdfs hadoop  4096 Jun 12  2018 bin-mapreduce1
drwxr-xr-x  3 hdfs hadoop  4096 Jun 12  2018 cloudera
drwxr-xr-x  6 root hadoop  4096 Jun 12  2018 etc
drwxr-xr-x  5 hdfs hadoop  4096 Jun 12  2018 examples
drwxr-xr-x  3 hdfs hadoop  4096 Jun 12  2018 examples-mapreduce1
drwxr-xr-x  2 hdfs hadoop  4096 Jun 12  2018 include
drwxr-xr-x  3 hdfs hadoop  4096 Jun 12  2018 lib
drwxr-xr-x  3 hdfs hadoop  4096 Jun 12  2018 libexec
-rw-r--r--  1 hdfs hadoop 85063 Jun 12  2018 LICENSE.txt
drwxrwxr-x  5 root hadoop  4096 Sep 27 19:47 logs
-rw-r--r--  1 hdfs hadoop 14978 Jun 12  2018 NOTICE.txt
-rw-r--r--  1 hdfs hadoop  1366 Jun 12  2018 README.txt
drwxr-xr-x  3 hdfs hadoop  4096 Jun 12  2018 sbin
drwxr-xr-x  4 hdfs hadoop  4096 Jun 12  2018 share
drwxr-xr-x 18 hdfs hadoop  4096 Jun 12  2018 src
Kerberos requires that etc be owned by root, and the hadoop directory inside it is also owned by root.
[root@cdh0 hadoop-2.6.0-cdh5.14.4]# cd etc/
[root@cdh0 etc]# ll
total 16
drwxr-xr-x 2 root hadoop 4096 Jun 12 2018 hadoop
drwxr-xr-x 2 hdfs hadoop 4096 Jun 12 2018 hadoop-mapreduce1
drwxr-xr-x 2 hdfs hadoop 4096 Jun 12 2018 hadoop-mapreduce1-pseudo
drwxr-xr-x 2 hdfs hadoop 4096 Jun 12 2018 hadoop-mapreduce1-secure
Inside hadoop are the concrete hdfs and mapred configuration files:
-rwxr-xr-x 1 root hadoop  318 Jun 12 2018 container-executor.cfg
-rwxr-xr-x 1 hdfs hadoop  774 Jun 12 2018 core-site.xml
-rwxr-xr-x 1 hdfs hadoop 3670 Jun 12 2018 hadoop-env.cmd
-rwxr-xr-x 1 hdfs hadoop 4224 Jun 12 2018 hadoop-env.sh
-rwxr-xr-x 1 hdfs hadoop 2598 Jun 12 2018 hadoop-metrics2.properties
-rwxr-xr-x 1 hdfs hadoop 2490 Jun 12 2018 hadoop-metrics.properties
Because of container-executor.cfg's special requirements, the directory containing it and every directory above it must be owned by the root account. Most files keep their read/execute (r-x) bits so they remain readable, while the rest are assigned to the matching hdfs account; this prevents other users from tampering with the configuration files.
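For reference, a container-executor.cfg for a Kerberized cluster typically contains entries like the following; the course does not show the file's contents here, so the values are illustrative assumptions to adapt:
# /bigdata/hadoop-2.6.0-cdh5.14.4/etc/hadoop/container-executor.cfg (illustrative values)
yarn.nodemanager.linux-container-executor.group=hadoop   # group allowed to invoke the binary
banned.users=hdfs,yarn,mapred,bin                        # users that may never launch containers
min.user.id=500                                          # refuse containers for system users below this uid
allowed.system.users=                                    # explicit exceptions to min.user.id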
There is a lot of care in these permission settings. MR_HISTORY must be set to 770: besides the owner having access, the group (the main users) must also be granted permission.