Part 2: Installing and Configuring HBase
Reference article: https://hbase.org.cn/docs/3.html
HBase can be installed in the following ways:
① Local mode: the simplest installation, suited to development and testing on a single machine. In local mode, HBase runs in a single Java process and stores its data on the local filesystem. Below we install HBase with docker-compose; the steps are as follows:
1: The docker-compose file is as follows:
version: "3.5"services: hbase: image: 'harisekhon/hbase' container_name: 'hbase' networks: - base-env-network ports: - '16010:16010' - '16020:16020' - '16000:16000' - '16301:16301' - '42182:2181' volumes: - ./hbase-data:/hbase-data environment: - CLUSTER_DNS=hbase - TZ=Asia/Shanghai# docker network create base-env-network networks: base-env-network: external: name: "base-env-network"
2: Start HBase.
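From the directory containing the compose file above, starting is the standard compose command:

docker-compose up -d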
3: Enter the hbase container as follows.
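A minimal sketch (the container name hbase comes from the compose file above; this assumes the harisekhon/hbase image puts the HBase binaries on the PATH):

docker exec -it hbase bash
# inside the container, open the HBase shell:
hbase shell
# then, at the hbase shell prompt, run: status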
② Fully distributed mode: deploys HBase in a real distributed environment. In fully distributed mode, HBase's components are spread across multiple machines for high availability, fault tolerance, and scalable performance. The steps are as follows:
1: Deploy Docker
# Install the yum-config-manager utility
yum -y install yum-utils

# Add a Docker repo; the Aliyun mirror is recommended:
# yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo

# Install docker-ce
yum install -y docker-ce

# Start Docker now and enable it at boot
systemctl enable --now docker
docker --version
2: Deploy docker-compose
curl -SL https://github.com/docker/compose/releases/download/v2.16.0/docker-compose-linux-x86_64 -o /usr/local/bin/docker-compose
chmod +x /usr/local/bin/docker-compose
docker-compose --version
3: Create a network
# Create the network. Note: do not use the name hadoop_network, or the hs2 service will fail to start!
docker network create hadoop-network

# Verify
docker network ls
4: Deploy ZooKeeper; create the directories and files:
[root@cdh1 zookeeper]# tree
.
├── docker-compose.yml
├── zk1
├── zk2
└── zk3

3 directories, 1 file
The docker-compose.yml file is as follows:
version: '3.7'

# Give the zk cluster its own network, named hadoop-network
networks:
  hadoop-network:
    external: true

# zk cluster configuration: each child of services corresponds to one zk node's container
services:
  zk1:
    # Docker image used by this container
    image: zookeeper
    hostname: zk1
    container_name: zk1
    restart: always
    # Port mappings between the container and the host
    ports:
      - 2181:2181
      - 28081:8080
    # Container environment variables
    environment:
      # ID of this zk instance
      ZOO_MY_ID: 1
      # Machine/port list for the whole zk ensemble
      ZOO_SERVERS: server.1=0.0.0.0:2888:3888;2181 server.2=zk2:2888:3888;2181 server.3=zk3:2888:3888;2181
    # Mount container paths onto the host so the host and the container share data
    volumes:
      - ./zk1/data:/data
      - ./zk1/datalog:/datalog
    # Join this container to the isolated hadoop-network
    networks:
      - hadoop-network
  zk2:
    image: zookeeper
    hostname: zk2
    container_name: zk2
    restart: always
    ports:
      - 2182:2181
      - 28082:8080
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zk1:2888:3888;2181 server.2=0.0.0.0:2888:3888;2181 server.3=zk3:2888:3888;2181
    volumes:
      - ./zk2/data:/data
      - ./zk2/datalog:/datalog
    networks:
      - hadoop-network
  zk3:
    image: zookeeper
    hostname: zk3
    container_name: zk3
    restart: always
    ports:
      - 2183:2181
      - 28083:8080
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zk1:2888:3888;2181 server.2=zk2:2888:3888;2181 server.3=0.0.0.0:2888:3888;2181
    volumes:
      - ./zk3/data:/data
      - ./zk3/datalog:/datalog
    networks:
      - hadoop-network
wget https://mirrors.tuna.tsinghua.edu.cn/apache/zookeeper/zookeeper-3.8.2/apache-zookeeper-3.8.2-bin.tar.gz --no-check-certificate
Start:
[root@cdh1 zookeeper]# docker-compose up -d
Creating zk3 ... done
Creating zk2 ... done
Creating zk1 ... done
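To confirm the ensemble is healthy, you can ask each node for its role (a quick check; this assumes zkServer.sh is on the image's PATH, as it is in the official zookeeper image):

docker exec zk1 zkServer.sh status
docker exec zk2 zkServer.sh status
docker exec zk3 zkServer.sh status
# one node should report "Mode: leader", the other two "Mode: follower"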
5: Download the Hadoop deployment package
git clone https://gitee.com/hadoop-bigdata/docker-compose-hadoop.git
6: Install and deploy MySQL 5.7; here MySQL mainly stores Hive's metadata.
cd docker-compose-hadoop/mysql
docker-compose -f mysql-compose.yaml up -d
docker-compose -f mysql-compose.yaml ps

# The root password is 123456; the login command is below. Note that at work you generally
# should not type a password in plain text on the command line, or the security team will
# flag it. Remember this!
docker exec -it mysql mysql -uroot -p123456
7: Install Hadoop and Hive
cd docker-compose-hadoop/hadoop_hive
docker-compose -f docker-compose.yaml up -d

# Check status
docker-compose -f docker-compose.yaml ps

# hive
docker exec -it hive-hiveserver2 hive -e "show databases;"

# hiveserver2
docker exec -it hive-hiveserver2 beeline -u jdbc:hive2://hive-hiveserver2:10000 -n hadoop -e "show databases;"
After startup, if the hadoop historyserver container does not come up healthy, run the following:
docker exec -it hadoop-hdfs-nn hdfs dfs -chmod 777 /tmp
docker restart hadoop-mr-historyserver
To refresh the HDFS node lists (for example, after editing the whitelist/blacklist files), run the following:
[root@cdh1 ~]# docker exec -it hadoop-hdfs-nn hdfs dfsadmin -refreshNodes
Refresh nodes successful

[root@cdh1 ~]# docker exec -it hadoop-hdfs-dn-0 hdfs dfsadmin -fs hdfs://hadoop-hdfs-nn:9000 -refreshNodes
Refresh nodes successful

[root@cdh1 ~]# docker exec -it hadoop-hdfs-dn-1 hdfs dfsadmin -fs hdfs://hadoop-hdfs-nn:9000 -refreshNodes
Refresh nodes successful

[root@cdh1 ~]# docker exec -it hadoop-hdfs-dn-2 hdfs dfsadmin -fs hdfs://hadoop-hdfs-nn:9000 -refreshNodes
Refresh nodes successful
You can view the HDFS distribution at http://cdh1:30070, and check YARN resource usage at http://cdh1:30888/cluster.
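The same information is available from the command line; for example, hdfs dfsadmin -report (a standard HDFS command; the container name comes from the compose project above) prints the registered DataNodes:

docker exec -it hadoop-hdfs-nn hdfs dfsadmin -report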
8: Configure HBase parameters
mkdir -p conf/hadoop
conf/hbase-env.sh
export JAVA_HOME=/opt/apache/jdk
export HBASE_CLASSPATH=/opt/apache/hbase/conf
# ZooKeeper is managed externally (the zk1/zk2/zk3 cluster above), not by HBase
export HBASE_MANAGES_ZK=false
conf/hbase-site.xml

<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://hadoop-hdfs-nn:9000/hbase</value>
    <!-- a value like hdfs://ns1/hbase would correspond to the dfs.nameservices value in hdfs-site.xml -->
  </property>
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>zk1,zk2,zk3</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.master</name>
    <value>60000</value>
    <description>Standalone mode needs hostname/IP and port; HA mode needs only the port</description>
  </property>
  <property>
    <name>hbase.master.info.bindAddress</name>
    <value>0.0.0.0</value>
  </property>
  <property>
    <name>hbase.master.port</name>
    <value>16000</value>
  </property>
  <property>
    <name>hbase.master.info.port</name>
    <value>16010</value>
  </property>
  <property>
    <name>hbase.regionserver.port</name>
    <value>16020</value>
  </property>
  <property>
    <name>hbase.regionserver.info.port</name>
    <value>16030</value>
  </property>
  <property>
    <name>hbase.wal.provider</name>
    <value>filesystem</value>
    <!-- multiwal is also an option -->
  </property>
</configuration>
conf/backup-masters
hbase-master-2
conf/regionservers
hbase-regionserver-1
hbase-regionserver-2
hbase-regionserver-3
conf/hadoop/core-site.xml
<?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file.--><!-- Put site-specific property overrides in this file. --> <configuration> <!--配置namenode的地址 --> <property> <name>fs.defaultFS</name> <value>hdfs://hadoop-hdfs-nn:9000</value> </property> <!-- 文件的缓冲区大小(128KB),默认值是4KB --> <property> <name>io.file.buffer.size</name> <value>131072</value> </property> <!-- 文件系统垃圾桶保存时间 --> <property> <name>fs.trash.interval</name> <value>1440</value> </property> <!-- 配置hadoop临时目录,存储元数据用的,请确保该目录(/opt/apache/hadoop/data/hdfs/)已被手动创建,tmp目录会自动创建 --> <property> <name>hadoop.tmp.dir</name> <value>/opt/apache/hadoop/data/hdfs/tmp</value> </property> <!--配置HDFS网页登录使用的静态用户为root--> <property> <name>hadoop.http.staticuser.user</name> <value>root</value> </property> <!--配置root(超级用户)允许通过代理访问的主机节点--> <property> <name>hadoop.proxyuser.root.hosts</name> <value>*</value> </property> <!--配置root(超级用户)允许通过代理用户所属组--> <property> <name>hadoop.proxyuser.root.groups</name> <value>*</value> </property> <!--配置root(超级用户)允许通过代理的用户--> <property> <name>hadoop.proxyuser.root.user</name> <value>*</value> </property> <!--配置hive允许通过代理访问的主机节点--> <property> <name>hadoop.proxyuser.hive.hosts</name> <value>*</value> </property> conf/hadoop/hdfs-site.xml <!--配置hive允许通过代理用户所属组--> <property> <name>hadoop.proxyuser.hive.groups</name> <value>*</value> </property> <!--配置hive允许通过代理访问的主机节点--> <property> <name>hadoop.proxyuser.hadoop.hosts</name> <value>*</value> </property> <!--配置hive允许通过代理用户所属组--> <property> <name>hadoop.proxyuser.hadoop.groups</name> <value>*</value> </property> </configuration>
conf/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file.--> <!-- Put site-specific property overrides in this file. --> <configuration> <!-- namenode web访问配置 --> <property> <name>dfs.namenode.http-address</name> <value>0.0.0.0:9870</value> </property> <!-- 必须将dfs.webhdfs.enabled属性设置为true,否则就不能使用webhdfs的LISTSTATUS、LISTFILESTATUS等需要列出文件、文件夹状态的命令,因为这些信息都是由namenode来保存的。--> <property> <name>dfs.webhdfs.enabled</name> <value>true</value>完成conf配置后,需要设置读写权限 8. 编写环境.env文件 </property> <property> <name>dfs.namenode.name.dir</name> <value>/opt/apache/hadoop/data/hdfs/namenode</value> </property> <property> <name>dfs.datanode.data.dir</name> <value>/opt/apache/hadoop/data/hdfs/datanode/data1,/opt/apache/hadoop/data/h dfs/datanode/data2,/opt/apache/hadoop/data/hdfs/datanode/data3</value> </property> <property> <name>dfs.replication</name> <value>3</value> </property> <!-- 设置SNN进程运行机器位置信息 --> <property> <name>dfs.namenode.secondary.http-address</name> <value>hadoop-hdfs-nn2:9868</value> </property> <property> <name>dfs.namenode.datanode.registration.ip-hostnamecheck</name> <value>false</value> </property> <!-- 白名单 --> <property> <name>dfs.hosts</name> <value>/opt/apache/hadoop/etc/hadoop/dfs.hosts</value> </property> <!-- 黑名单 --> <property> <name>dfs.hosts.exclude</name> <value>/opt/apache/hadoop/etc/hadoop/dfs.hosts.exclude</value> </property> </configuration>
After finishing the conf files, set read/write permissions:
chmod -R 777 conf/
9: Write the .env environment file (docker-compose reads it automatically and substitutes the ${...} references in the compose file):
HBASE_MASTER_PORT=16000
HBASE_MASTER_INFO_PORT=16010
HBASE_HOME=/opt/apache/hbase
HBASE_REGIONSERVER_PORT=16020
10: Compose the docker-compose.yaml
version: '3'
services:
  hbase-master-1:
    image: registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-master-1
    hostname: hbase-master-1
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36010:${HBASE_MASTER_PORT}"
      - "36020:${HBASE_MASTER_INFO_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-master"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_MASTER_PORT} || exit 1"]
      interval: 10s
      timeout: 20s
      retries: 3
  hbase-master-2:
    image: registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-master-2
    hostname: hbase-master-2
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36011:${HBASE_MASTER_PORT}"
      - "36021:${HBASE_MASTER_INFO_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-master hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_MASTER_PORT} || exit 1"]
      interval: 10s
      timeout: 20s
      retries: 3
  hbase-regionserver-1:
    image: registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-regionserver-1
    hostname: hbase-regionserver-1
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36030:${HBASE_REGIONSERVER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-regionserver hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_REGIONSERVER_PORT} || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 3
  hbase-regionserver-2:
    image: registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-regionserver-2
    hostname: hbase-regionserver-2
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36031:${HBASE_REGIONSERVER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-regionserver hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_REGIONSERVER_PORT} || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 3
  hbase-regionserver-3:
    image: registry.cn-hangzhou.aliyuncs.com/bigdata_cloudnative/hbase:2.5.4
    user: "hadoop:hadoop"
    container_name: hbase-regionserver-3
    hostname: hbase-regionserver-3
    restart: always
    privileged: true
    env_file:
      - .env
    volumes:
      - ./conf/hbase-env.sh:${HBASE_HOME}/conf/hbase-env.sh
      - ./conf/hbase-site.xml:${HBASE_HOME}/conf/hbase-site.xml
      - ./conf/backup-masters:${HBASE_HOME}/conf/backup-masters
      - ./conf/regionservers:${HBASE_HOME}/conf/regionservers
      - ./conf/hadoop/core-site.xml:${HBASE_HOME}/conf/core-site.xml
      - ./conf/hadoop/hdfs-site.xml:${HBASE_HOME}/conf/hdfs-site.xml
    ports:
      - "36032:${HBASE_REGIONSERVER_PORT}"
    command: ["sh","-c","/opt/apache/bootstrap.sh hbase-regionserver hbase-master-1 ${HBASE_MASTER_PORT}"]
    networks:
      - hadoop-network
    healthcheck:
      test: ["CMD-SHELL", "netstat -tnlp|grep :${HBASE_REGIONSERVER_PORT} || exit 1"]
      interval: 10s
      timeout: 10s
      retries: 3

# Join the external network
networks:
  hadoop-network:
    external: true
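Before starting, you can check that the variables from .env substitute correctly by rendering the resolved file (docker-compose config validates the file and prints it with all ${...} references expanded):

docker-compose -f docker-compose.yaml config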
11: Deploy. The current directory structure is as follows:
[root@cdh1 hbase]# tree
.
├── .env
├── conf
│   ├── backup-masters
│   ├── hadoop
│   │   ├── core-site.xml
│   │   └── hdfs-site.xml
│   ├── hbase-env.sh
│   ├── hbase-site.xml
│   └── regionservers
└── docker-compose.yaml
Start:
docker-compose -f docker-compose.yaml up -d

# Check status
docker-compose -f docker-compose.yaml ps

[root@cdh1 hbase]# docker-compose ps
        Name                       Command               State                                    Ports
---------------------------------------------------------------------------------------------------------------------------------------
hbase-master-1         sh -c /opt/apache/bootstra ...   Up (healthy)   0.0.0.0:36010->16000/tcp,:::36010->16000/tcp, 0.0.0.0:36020->16010/tcp,:::36020->16010/tcp
hbase-master-2         sh -c /opt/apache/bootstra ...   Up (healthy)   0.0.0.0:36011->16000/tcp,:::36011->16000/tcp, 0.0.0.0:36021->16010/tcp,:::36021->16010/tcp
hbase-regionserver-1   sh -c /opt/apache/bootstra ...   Up (healthy)   0.0.0.0:36030->16020/tcp,:::36030->16020/tcp
hbase-regionserver-2   sh -c /opt/apache/bootstra ...   Up (healthy)   0.0.0.0:36031->16020/tcp,:::36031->16020/tcp
hbase-regionserver-3   sh -c /opt/apache/bootstra ...   Up (healthy)   0.0.0.0:36032->16020/tcp,:::36032->16020/tcp
Cluster information is available through the active master, hbase-master-1 (its web UI is mapped to host port 36020 in the compose file above).
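As a final smoke test, you can exercise the cluster from the HBase shell (a sketch: the table and column-family names below are made up for illustration, and this assumes the image puts the hbase binary on the PATH):

docker exec -it hbase-master-1 hbase shell

# then, at the hbase shell prompt:
#   status                                  -- expect 1 active master, 1 backup master, 3 region servers
#   create 'demo_table', 'cf'               -- hypothetical table with one column family
#   put 'demo_table', 'row1', 'cf:a', 'v1'
#   scan 'demo_table'
#   disable 'demo_table'
#   drop 'demo_table'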