1 Preface
Cloudera repackages and enhances the native Apache Hadoop components, and it simplifies their deployment.
For the configuration of a newer version (CentOS 7), see:
http://cmdschool.blog.51cto.com/2420395/1916322
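2 Theoretical Background
2.1 Software Architecture of the Deployment
1) Oracle JDK
2) Cloudera Manager Server and Agent packages
3) Supporting database software
4) CDH and managed service software
2.2 Deployment Steps and Installation Methods
2.2.1 Installation Methods
A) Installation with the Cloudera Manager installer (easy)
B) Installation from a yum repository (moderate)
C) Installation from source (hard)
Note: this tutorial uses method B.
2.2.2 Deployment Steps
1) Install the JDK
2) Install and configure the database
3) Install the Cloudera Manager Server
4) Install the Cloudera Manager Agents
5) Install CDH and the managed service software
6) Create, start, and configure CDH and the managed services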
2.3 Files Installed with the Cloudera Manager Server
    rpm -ql cloudera-manager-server
The output looks like this:
    /etc/cloudera-scm-server
    /etc/cloudera-scm-server/db.properties
    /etc/cloudera-scm-server/log4j.properties
    /etc/default/cloudera-scm-server
    /etc/rc.d/init.d/cloudera-scm-server
    /opt/cloudera/csd
    /opt/cloudera/parcel-repo
    /usr/sbin/cmf-server
    /var/log/cloudera-scm-server
    /var/run/cloudera-scm-server
The files and directories serve the following purposes:
1) Lines 2 to 4 of the listing (db.properties, log4j.properties, and /etc/default/cloudera-scm-server) are the Cloudera Manager Server configuration files.
2) /opt/cloudera/parcel-repo is the directory that stores downloaded installation packages.
3 Hands-on Section
3.1 Environment
3.1.1 System Information
OS = CentOS 6.6 x86_64
Note: use a minimal installation of the OS; otherwise the Sqoop service may fail to start.
3.1.2 Host Information
Cloudera Manager:
ip address=10.168.0.120
hostname=cdm-m.cmdschool.org
Cloudera Host1:
ip address=10.168.0.121
hostname=cdm-h1.cmdschool.org
Cloudera Host2:
ip address=10.168.0.122
hostname=cdm-h2.cmdschool.org
Cloudera Host3:
ip address=10.168.0.123
hostname=cdm-h3.cmdschool.org
Cloudera Host4:
ip address=10.168.0.124
hostname=cdm-h4.cmdschool.org
3.2 Configuring the Runtime Environment
In Cloudera Manager & Cloudera Host[1-4]
3.2.1 Disabling SELinux
    getenforce
If the output is:
    Enforcing
then run:
    setenforce 0
    sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
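The sed command above only rewrites a line that currently reads SELINUX=enforcing, so a host set to permissive would be left unchanged. A minimal sketch that sets the mode whatever the current value is follows; the function name set_selinux_mode is hypothetical, GNU sed is assumed, and the config path is a parameter so the function can be tried on a copy first.

```shell
#!/bin/sh
# Hypothetical helper: rewrite the SELINUX= line of an SELinux config file.
# $1 = target mode (for example: disabled)
# $2 = config file path (defaults to /etc/selinux/config)
set_selinux_mode() {
    mode=$1
    conf=${2:-/etc/selinux/config}
    # Replace whatever value follows SELINUX=; the SELINUXTYPE= line
    # does not match this pattern and is left untouched.
    sed -i "s/^SELINUX=.*/SELINUX=${mode}/" "$conf"
}

# Example on a scratch copy rather than the live file:
# cp /etc/selinux/config /tmp/selinux.test
# set_selinux_mode disabled /tmp/selinux.test
```

The change to the file takes effect at the next boot; setenforce 0 is still what switches the running system to permissive mode.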
3.2.2 Configuring hosts
Edit /etc/hosts with vim:
    10.168.0.120 cdm-m.cmdschool.org
    10.168.0.121 cdm-h1.cmdschool.org
    10.168.0.122 cdm-h2.cmdschool.org
    10.168.0.123 cdm-h3.cmdschool.org
    10.168.0.124 cdm-h4.cmdschool.org
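Because the same five entries go onto every host, they can also be appended by a script. The following is only a sketch: the function names add_host_entry and add_cluster_hosts are hypothetical, and the target file is a parameter so the script can be tested against a scratch file before touching /etc/hosts.

```shell
#!/bin/sh
# Hypothetical helper: append "IP NAME" to a hosts file unless the name
# is already listed there.
# $1 = IP address, $2 = host name, $3 = hosts file (defaults to /etc/hosts)
add_host_entry() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}
    # -F treats the dots literally, -w matches the name as a whole word.
    if ! grep -qwF "$name" "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Add the five entries used in this guide (cdm-m plus cdm-h1..cdm-h4).
add_cluster_hosts() {
    file=${1:-/etc/hosts}
    add_host_entry 10.168.0.120 cdm-m.cmdschool.org "$file"
    i=1
    while [ "$i" -le 4 ]; do
        add_host_entry "10.168.0.12$i" "cdm-h$i.cmdschool.org" "$file"
        i=$((i + 1))
    done
}
```

Running add_cluster_hosts a second time adds nothing, so the script is safe to repeat on a host that is already configured.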
3.2.3 Checking the Host Name
    hostname
Note: a host name that differs from the entries above will prevent the services from starting.
3.2.4 Configuring sudo (single-user mode only, optional)
    visudo
Add the following group:
    %cloudera-scm ALL=(ALL) NOPASSWD: ALL
Make sure the file contains the following line:
    Defaults secure_path = /sbin:/bin:/usr/sbin:/usr/bin
Edit /etc/pam.d/su with vim and make sure it contains the following line:
    session required pam_limits.so
3.2.5 Stopping the Firewall and Disabling It at Boot
    /etc/init.d/iptables stop
    chkconfig iptables off
3.2.6 Tuning Swappiness
1) Check the current swappiness
    cat /proc/sys/vm/swappiness
This prints the current value.
2) Lower swappiness temporarily
    sysctl vm.swappiness=0
3) Lower swappiness permanently
Edit /etc/sysctl.conf with vim:
    kernel.shmall = 4294967296
    vm.swappiness = 0
Then run the following command to apply the settings:
    sysctl -p
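Editing /etc/sysctl.conf by hand on five hosts is easy to get inconsistent. The sketch below updates a key if it is already present and appends it otherwise; set_sysctl_key is a hypothetical name, GNU sed is assumed, and the file path is a parameter so the function can be exercised on a copy.

```shell
#!/bin/sh
# Hypothetical helper: set "key = value" in a sysctl-style file.
# Updates the line if the key exists, appends it otherwise.
# $1 = key, $2 = value, $3 = file (defaults to /etc/sysctl.conf)
set_sysctl_key() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    # Escape the dots so that "vm.swappiness" is matched literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -Eq "^[[:space:]]*${pattern}[[:space:]]*=" "$file"; then
        sed -i -E "s|^[[:space:]]*${pattern}[[:space:]]*=.*|${key} = ${value}|" "$file"
    else
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Example (followed by "sysctl -p" to load the file):
# set_sysctl_key vm.swappiness 0
# set_sysctl_key kernel.shmall 4294967296
```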
3.2.7 Fixing the Transparent Huge Pages Issue
1) Check the transparent huge pages setting
    cat /sys/kernel/mm/transparent_hugepage/defrag
If the output is:
    [always] madvise never
2) Disable transparent huge page defragmentation temporarily
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
Confirm that the setting took effect:
    cat /sys/kernel/mm/transparent_hugepage/defrag
The output should be:
    always madvise [never]
3) Apply the setting automatically at boot
Edit /etc/rc.local with vim and add the following line:
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
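The kernel marks the active mode with square brackets, which is what the before and after outputs above differ in. A small sketch that extracts the bracketed word makes the check scriptable; the function names thp_active_mode and thp_is_disabled are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the active (bracketed) mode from a status line.
# $1 = a line such as "[always] madvise never"
thp_active_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z+]*\)\].*/\1/p'
}

# Succeed only when the given status file reports "never" as active.
# $1 = status file (defaults to the defrag file used in this guide)
thp_is_disabled() {
    file=${1:-/sys/kernel/mm/transparent_hugepage/defrag}
    [ "$(thp_active_mode "$(cat "$file")")" = "never" ]
}

# Example:
# thp_is_disabled || echo "transparent huge page defrag is still enabled" >&2
```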
3.3 Installing and Configuring the yum Repositories
3.3.1 Common yum Repository Configuration
In Cloudera Manager & Cloudera Host[1-4]
1) Configure the yum repository
Download the default repository file:
    wget -P /etc/yum.repos.d/ https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/cloudera-manager.repo
Pin the repository to a specific version:
Edit /etc/yum.repos.d/cloudera-manager.repo with vim and change the following parameter:
    baseurl=https://archive.cloudera.com/cm5/redhat/6/x86_64/cm/5.6.0/
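The baseurl edit can also be applied with sed, which keeps all hosts on exactly the same release. This is a sketch under assumptions: pin_cm_version is a hypothetical name, GNU sed is assumed, and the function rewrites every line that starts with baseurl= in the given file.

```shell
#!/bin/sh
# Hypothetical helper: pin the baseurl of a cloudera-manager.repo file
# to one Cloudera Manager release.
# $1 = version (for example: 5.6.0)
# $2 = repo file (defaults to /etc/yum.repos.d/cloudera-manager.repo)
pin_cm_version() {
    version=$1
    repo=${2:-/etc/yum.repos.d/cloudera-manager.repo}
    base=https://archive.cloudera.com/cm5/redhat/6/x86_64/cm
    # Use | as the sed delimiter because the URL contains slashes.
    sed -i "s|^baseurl=.*|baseurl=${base}/${version}/|" "$repo"
}

# Example:
# pin_cm_version 5.6.0
```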
2) Install the configuration tools
    yum install -y vim wget openssh-clients
3) Install the JDK
    yum install -y oracle-j2sdk1.7
4) Install Python
    yum install -y python
5) Install ntpd
    yum install -y ntp
3.3.2 Packages for the Cloudera Manager Host
In Cloudera Manager
1) Install the Cloudera Manager packages
    yum install -y cloudera-manager-daemons cloudera-manager-server
2) Install MySQL
    yum install -y mysql-server mysql-devel mysql
3.3.3 Packages for the Cloudera Manager Agent Hosts
In Cloudera Host[1-4]
Install the Cloudera Manager Agent packages:
    yum install -y cloudera-manager-agent cloudera-manager-daemons
3.4 Environment Configuration That Depends on the yum Packages
3.4.1 Configuring the JDK Environment Variables
In Cloudera Manager & Cloudera Host[1-4]
1) Edit /etc/profile with vim and append the following lines:
    export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
    export PATH=${JAVA_HOME}/bin:$PATH
2) Load the Java environment variables
    source /etc/profile
3) Test the JDK configuration
    java -version
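A wrong JAVA_HOME tends to surface only later, when a service fails to start. The sketch below checks that a directory actually contains an executable bin/java; is_jdk_home is a hypothetical name, and the check does not verify the Java version.

```shell
#!/bin/sh
# Hypothetical helper: succeed when a directory looks like a usable JDK
# home, that is, it contains an executable bin/java.
# $1 = candidate directory (defaults to the current JAVA_HOME, if any)
is_jdk_home() {
    home=${1:-${JAVA_HOME:-}}
    [ -n "$home" ] && [ -x "$home/bin/java" ]
}

# Example:
# is_jdk_home /usr/java/jdk1.7.0_67-cloudera || echo "JAVA_HOME is not a JDK" >&2
```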
3.4.2 Checking Permissions (single-user mode only, optional)
In Cloudera Manager & Cloudera Host[1-4]
Check that the cloudera-scm user has full permissions on the following directories.
Check the permissions of the directory itself:
    ls -ld /opt/cloudera/
The output looks like this:
    drwxr-xr-x. 4 cloudera-scm cloudera-scm 4096 May 23 13:51 /opt/cloudera/
Check the permissions of the subdirectories:
    ls -lR /opt/cloudera/
The output looks like this:
    /opt/cloudera/:
    total 8
    drwxr-xr-x. 2 cloudera-scm cloudera-scm 4096 Feb 12 11:28 csd
    drwxr-xr-x. 2 cloudera-scm cloudera-scm 4096 Feb 12 11:28 parcel-repo

    /opt/cloudera/csd:
    total 0

    /opt/cloudera/parcel-repo:
    total 0
Likewise, check the permissions of the server or agent directories:
    ls -ld /var/log/cloudera-scm-server/
    ls -lR /var/log/cloudera-scm-server/
    ls -ld /var/lib/cloudera-scm-agent/
    ls -lR /var/lib/cloudera-scm-agent/
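Reading ls -lR output by eye does not scale to deep trees. As a sketch, find can list only the entries whose owner differs from the expected one; tree_owned_by is a hypothetical name, and the function checks ownership only, not the permission bits.

```shell
#!/bin/sh
# Hypothetical helper: succeed only when every entry under a directory
# is owned by the given user.
# $1 = directory, $2 = expected owner (user name or numeric uid)
# Returns 0 if the whole tree matches, 1 if some entry has another owner,
# and 2 if find itself failed (for example: unknown user name).
tree_owned_by() {
    foreign=$(find "$1" ! -user "$2" -print) || return 2
    [ -z "$foreign" ]
}

# Example:
# tree_owned_by /opt/cloudera cloudera-scm || echo "ownership needs fixing" >&2
```

To see which entries are wrong, run the find command from the function body directly.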
3.4.3 Checking the Thread Limit Configuration
In Cloudera Manager & Cloudera Host[1-4]
    cat /etc/security/limits.d/cloudera-scm.conf
The output looks like this:
    #
    # (c) Copyright 2014 Cloudera, Inc.
    #
    cloudera-scm soft nofile 32768
    cloudera-scm soft nproc 65536
    cloudera-scm hard nofile 1048576
    cloudera-scm hard nproc unlimited
    cloudera-scm hard memlock unlimited
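Each line of the limits file has four fields: user, type, item, and value. A short awk sketch can pull out a single value so that the check can be repeated on every host; get_limit is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print the value of one limit from a limits.d-style file.
# $1 = file, $2 = user, $3 = type (soft or hard), $4 = item (nofile, nproc, memlock)
get_limit() {
    awk -v u="$2" -v t="$3" -v i="$4" \
        '$1 == u && $2 == t && $3 == i { print $4 }' "$1"
}

# Example:
# get_limit /etc/security/limits.d/cloudera-scm.conf cloudera-scm soft nofile
```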
3.4.4 Configuring the Cloudera Manager Host
In Cloudera Manager
1) Synchronize the time once
    ntpdate 0.centos.pool.ntp.org
2) Start ntpd and enable it at boot
    /etc/init.d/ntpd start
    chkconfig ntpd on
3.4.5 Configuring the Cloudera Manager Agent Hosts
In Cloudera Host[1-4]
1) Synchronize the time once
    ntpdate 10.168.0.120
2) Edit /etc/ntp.conf with vim
Comment out the public time servers and add the address of the internal time server:
    #server 0.centos.pool.ntp.org iburst
    #server 1.centos.pool.ntp.org iburst
    #server 2.centos.pool.ntp.org iburst
    #server 3.centos.pool.ntp.org iburst
    server 10.168.0.120 iburst
3) Start ntpd and enable it at boot
    /etc/init.d/ntpd start
    chkconfig ntpd on
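The ntp.conf change on the four agent hosts is identical, so it can be scripted. The following is a sketch only: use_internal_ntp is a hypothetical name, and it assumes that active server lines start with the word server at the beginning of the line, as in the stock CentOS file.

```shell
#!/bin/sh
# Hypothetical helper: point an ntp.conf at a single internal time server.
# $1 = internal NTP server address
# $2 = ntp.conf path (defaults to /etc/ntp.conf)
use_internal_ntp() {
    server=$1
    conf=${2:-/etc/ntp.conf}
    line="server ${server} iburst"
    # Comment out every active "server" line except the internal one.
    awk -v keep="$line" '
        /^server[ \t]/ && $0 != keep { print "#" $0; next }
        { print }
    ' "$conf" > "${conf}.tmp" && cat "${conf}.tmp" > "$conf" && rm -f "${conf}.tmp"
    # Append the internal server only if it is not present yet.
    grep -qxF "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
}

# Example:
# use_internal_ntp 10.168.0.120
```

The file is rewritten in place through cat rather than mv so that its ownership and permissions stay as they were. Running the function twice leaves the file unchanged the second time.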
3.4.7 Installing the MySQL JDBC Driver
In Cloudera Manager & Cloudera Host[1-4]
    wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.39.tar.gz
    tar zxvf mysql-connector-java-5.1.39.tar.gz
    mkdir -p /usr/share/java/
    cp mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar /usr/share/java/mysql-connector-java.jar
3.4.8 Configuring Public Key Authentication
In Cloudera Manager:
    ssh-keygen -t rsa
Note: press Enter at every prompt.
Copy the public key to every host, including the Cloudera Manager Agents:
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.168.0.120
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.168.0.121
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.168.0.122
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.168.0.123
    ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.168.0.124
In Cloudera Manager:
    ssh 10.168.0.120
    ssh 10.168.0.121
    ssh 10.168.0.122
    ssh 10.168.0.123
    ssh 10.168.0.124
Note: if every login succeeds without a password prompt, the setup works.
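The five nearly identical commands can be generated from a host list, and the login test can run without interactive sessions. This is a sketch; the variable CLUSTER_IPS and both function names are hypothetical, while BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
#!/bin/sh
# Addresses from section 3.1.2 (the manager first, then the four hosts).
CLUSTER_IPS="10.168.0.120 10.168.0.121 10.168.0.122 10.168.0.123 10.168.0.124"

# Print one ssh-copy-id command per host, for review before running them.
print_copy_id_commands() {
    for ip in $CLUSTER_IPS; do
        printf 'ssh-copy-id -i ~/.ssh/id_rsa.pub root@%s\n' "$ip"
    done
}

# Test password-less login on every host.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_passwordless() {
    for ip in $CLUSTER_IPS; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$ip" true; then
            echo "$ip ok"
        else
            echo "$ip FAILED"
        fi
    done
}
```

print_copy_id_commands only prints; each printed command still asks for the root password of its target host when it is run.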
3.5 Installing and Configuring Cloudera Manager
In Cloudera Manager
3.5.1 Changing the MySQL Parameters
1) Stop the database
    /etc/init.d/mysqld stop
2) Back up the ib_logfile files
    mkdir /var/lib/backup
    cd /var/lib/mysql/
    mv ib_logfile* /var/lib/backup/
3) Edit /etc/my.cnf with vim
Add the following parameters:
    [mysqld]
    transaction-isolation = READ-COMMITTED
    # Disabling symbolic-links is recommended to prevent assorted security risks;
    # to do so, uncomment this line:
    # symbolic-links = 0
    key_buffer = 16M
    key_buffer_size = 32M
    max_allowed_packet = 32M
    thread_stack = 256K
    thread_cache_size = 64
    query_cache_limit = 8M
    query_cache_size = 64M
    query_cache_type = 1
    max_connections = 550
    #expire_logs_days = 10
    #max_binlog_size = 100M
    #log_bin should be on a disk with enough free space. Replace '/var/lib/mysql/mysql_binary_log' with an appropriate path for your system
    #and chown the specified folder to the mysql user.
    log_bin=/var/lib/mysql/mysql_binary_log
    # For MySQL version 5.1.8 or later. Comment out binlog_format for older versions.
    binlog_format = mixed
    read_buffer_size = 2M
    read_rnd_buffer_size = 16M
    sort_buffer_size = 8M
    join_buffer_size = 8M

    # InnoDB settings
    innodb_file_per_table = 1
    innodb_flush_log_at_trx_commit = 2
    innodb_log_buffer_size = 64M
    innodb_buffer_pool_size = 4G
    innodb_thread_concurrency = 8
    innodb_flush_method = O_DIRECT
    innodb_log_file_size = 512M

    [mysqld_safe]
    log-error=/var/log/mysqld.log
    pid-file=/var/run/mysqld/mysqld.pid
    sql_mode=STRICT_ALL_TABLES
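After editing a file of this length it helps to read single values back, for example to confirm that innodb_buffer_pool_size is what was intended for the host's memory. The sketch below prints the value of one key inside one section; cnf_get is a hypothetical name, and it handles only plain key = value lines.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a key inside one [section]
# of a my.cnf-style file.
# $1 = file, $2 = section name without brackets, $3 = key
cnf_get() {
    awk -F= -v section="[$2]" -v key="$3" '
        # A line starting with "[" opens a new section.
        /^\[/ { in_section = ($0 == section); next }
        in_section {
            k = $1; v = $2
            # Trim leading and trailing blanks around key and value.
            gsub(/^[ \t]+|[ \t]+$/, "", k)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            if (k == key) print v
        }
    ' "$1"
}

# Example:
# cnf_get /etc/my.cnf mysqld innodb_buffer_pool_size
```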
3.5.2 Starting MySQL and Enabling It at Boot
    /etc/init.d/mysqld start
    chkconfig mysqld on
3.5.3 Initializing the Database
    mysql_secure_installation
The wizard runs as follows:
    [...]
    Enter current password for root (enter for none):
    OK, successfully used password, moving on...
    [...]
    Set root password? [Y/n] y
    New password:
    Re-enter new password:
    Remove anonymous users? [Y/n] y
    [...]
    Disallow root login remotely? [Y/n] n
    [...]
    Remove test database and access to it [Y/n] y
    [...]
    Reload privilege tables now? [Y/n] y
    All done!
3.5.4 Preparing the scm Database
1) Method 1
Configure the database:
    mysql -uroot -p
    create database scm default character set utf8;
    grant all privileges on *.* to scm@'cdm-m.cmdschool.org' identified by 'scm';
    flush privileges;
Edit /etc/cloudera-scm-server/db.properties with vim and change the following parameters:
    com.cloudera.cmf.db.type=mysql
    com.cloudera.cmf.db.host=cdm-m.cmdschool.org
    com.cloudera.cmf.db.name=scm
    com.cloudera.cmf.db.user=scm
    com.cloudera.cmf.db.password=scm
2) Method 2 (officially recommended)
Grant privileges to a temporary account named temp:
    mysql -uroot -p
    grant all privileges on *.* to 'temp'@'%' identified by 'temp' with grant option;
    flush privileges;
Generate the configuration file:
    /usr/share/cmf/schema/scm_prepare_database.sh mysql -h cdm-m.cmdschool.org -utemp -ptemp --scm-host cdm-m.cmdschool.org scm scm scm
The output looks like this:
    JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
    Verifying that we can write to /etc/cloudera-scm-server
    Creating SCM configuration file in /etc/cloudera-scm-server
    Executing: /usr/java/jdk1.7.0_67-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
    [main] DbCommandExecutor INFO Successfully connected to database.
    All done, your SCM database is configured correctly!
Check the generated file:
    cat /etc/cloudera-scm-server/db.properties
The output looks like this:
    # Auto-generated by scm_prepare_database.sh on Tue May 24 19:08:19 CST 2016
    #
    # For information describing how to configure the Cloudera Manager Server
    # to connect to databases, see the "Cloudera Manager Installation Guide."
    #
    com.cloudera.cmf.db.type=mysql
    com.cloudera.cmf.db.host=cdm-m.cmdschool.org
    com.cloudera.cmf.db.name=scm
    com.cloudera.cmf.db.user=scm
    com.cloudera.cmf.db.password=scm
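Whichever method produced db.properties, the server needs all five com.cloudera.cmf.db.* keys shown above. A small sketch can report any key that is missing; missing_db_props is a hypothetical name, and the check looks only at whether each key is present, not at its value.

```shell
#!/bin/sh
# Hypothetical helper: print each required key that is missing from a
# db.properties file. Prints nothing when all five keys are present.
# $1 = file (defaults to /etc/cloudera-scm-server/db.properties)
missing_db_props() {
    file=${1:-/etc/cloudera-scm-server/db.properties}
    for key in type host name user password; do
        grep -q "^com\.cloudera\.cmf\.db\.${key}=" "$file" ||
            echo "com.cloudera.cmf.db.${key}"
    done
}

# Example:
# missing_db_props
```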
Check the access privileges on the database:
    mysql -uroot -p
    show grants for scm@'cdm-m.cmdschool.org';
The output looks like this:
    +----------------------------------------------------------------------------------------------------------------------+
    | Grants for scm@cdm-m.cmdschool.org                                                                                   |
    +----------------------------------------------------------------------------------------------------------------------+
    | GRANT USAGE ON *.* TO 'scm'@'cdm-m.cmdschool.org' IDENTIFIED BY PASSWORD '*45E6E3C68BDF1AC7EBB5C5A3BCBD5E9437B293BE' |
    | GRANT ALL PRIVILEGES ON `scm`.* TO 'scm'@'cdm-m.cmdschool.org'                                                       |
    +----------------------------------------------------------------------------------------------------------------------+
    2 rows in set (0.00 sec)
Remove the temporary database account:
    drop user 'temp'@'%';
    flush privileges;
3.5.5 Creating the Additional Databases (optional)
1) List of additional databases
    Role                               | Database  | User   | Password
    Activity Monitor                   | amon      | amon   | amon_password
    Reports Manager                    | rman      | rman   | rman_password
    Hive Metastore Server              | metastore | hive   | hive_password
    Sentry Server                      | sentry    | sentry | sentry_password
    Cloudera Navigator Audit Server    | nav       | nav    | nav_password
    Cloudera Navigator Metadata Server | navms     | navms  | navms_password
2) Create the databases and set the account names and passwords
    mysql -uroot -p
    create database amon default character set utf8;
    grant all privileges on amon.* to 'amon'@'%' identified by 'amon_password';
    create database rman default character set utf8;
    grant all privileges on rman.* to 'rman'@'%' identified by 'rman_password';
    create database metastore default character set utf8;
    grant all privileges on metastore.* to 'hive'@'%' identified by 'hive_password';
    create database sentry default character set utf8;
    grant all privileges on sentry.* to 'sentry'@'%' identified by 'sentry_password';
    create database nav default character set utf8;
    grant all privileges on nav.* to 'nav'@'%' identified by 'nav_password';
    create database navms default character set utf8;
    grant all privileges on navms.* to 'navms'@'%' identified by 'navms_password';
    flush privileges;
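The statements above follow one pattern per row of the database list, so they can be generated from that list instead of typed by hand. The following is a sketch: gen_db_sql is a hypothetical name, and the passwords are the placeholder values from the list, to be replaced with real ones.

```shell
#!/bin/sh
# Hypothetical helper: read "database user password" triples from stdin
# and print the matching CREATE DATABASE and GRANT statements.
gen_db_sql() {
    while read -r db user pass; do
        # Skip blank lines.
        [ -n "$db" ] || continue
        printf "create database %s default character set utf8;\n" "$db"
        printf "grant all privileges on %s.* to '%s'@'%%' identified by '%s';\n" \
            "$db" "$user" "$pass"
    done
    printf 'flush privileges;\n'
}

# The six optional databases of this section.
gen_db_sql <<'EOF'
amon amon amon_password
rman rman rman_password
metastore hive hive_password
sentry sentry sentry_password
nav nav nav_password
navms navms navms_password
EOF
```

The script only prints SQL. After reviewing the output, it can be saved to a file and fed to mysql -uroot -p.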
3.5.6 Configuring the Oozie Database (optional)
1) Configure the database privileges
    mysql -uroot -p
    create database oozie default character set utf8;
    grant all privileges on oozie.* to 'oozie'@'localhost' identified by 'oozie';
    grant all privileges on oozie.* to 'oozie'@'%' identified by 'oozie';
    flush privileges;
2) Create the symbolic link that the Oozie database needs
    cd /opt/cloudera/parcels/CDH/lib/oozie/lib/
    ln -s /usr/share/java/mysql-connector-java.jar mysql-connector-java.jar
3.5.7 Starting the Service and Enabling It at Boot
    /etc/init.d/cloudera-scm-server start
    chkconfig cloudera-scm-server on
3.5.8 Troubleshooting
    tail -f /var/log/cloudera-scm-server/cloudera-scm-server.out
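When the log is long, filtering for likely failure lines is quicker than following it live. The sketch below assumes that problems show up as lines containing ERROR or Exception, which is typical of Java services but is not confirmed for this log by the source; recent_errors is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: print the last N lines of a log that contain
# "ERROR" or "Exception" (assumed markers, see the note above).
# $1 = log file, $2 = number of lines to show (defaults to 20)
recent_errors() {
    log=$1
    count=${2:-20}
    grep -E 'ERROR|Exception' "$log" | tail -n "$count"
}

# Example:
# recent_errors /var/log/cloudera-scm-server/cloudera-scm-server.out
```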
3.6 Installing the Cloudera Manager Agent
In Cloudera Host[1-4]
3.6.1 Creating the Parcel Directory
    mkdir -p /opt/cloudera/parcels
    chown cloudera-scm:cloudera-scm /opt/cloudera/parcels
3.6.2 Setting the Management Server and the Parcel Directory
Edit /etc/cloudera-scm-agent/config.ini with vim and make sure the following parameters are set and uncommented:
    server_host=cdm-m.cmdschool.org
    server_port=7182
    parcel_dir=/opt/cloudera/parcels
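3.6.3 Setting the User for Single-User Mode (single-user mode only; not configured in this tutorial)
Edit /etc/default/cloudera-scm-agent with vim and uncomment the following line:
    USER="cloudera-scm"
3.6.4 Starting the Service and Enabling It at Boot
    /etc/init.d/cloudera-scm-agent start
    chkconfig cloudera-scm-agent on
3.6.5 Troubleshooting
Use the following command to watch the error output while the service starts:
    tail -f /var/log/cloudera-scm-agent/cloudera-scm-agent.out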
3.7 Login and Configuration
In Cloudera WEB Manager
The web interface configuration is omitted from this chapter.