Kubernetes-在Kubernetes集群上搭建Hadoop集群

简介: 准备工作Hadoop镜像,到docker hub上拉取docker pull kubeguide/hadoop:latestKubernetes集群参考:Kubernetes-离线部署Kubernetes 1.

准备工作


  1. Hadoop镜像,到docker hub上拉取
docker pull kubeguide/hadoop:latest
  1. Kubernetes集群
    参考:Kubernetes-离线部署Kubernetes 1.9.0

开始搭建


  1. 编写好hadoop.yaml
hadoop.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-hadoop-conf
  namespace: default
data:
  HDFS_MASTER_SERVICE: hadoop-hdfs-master
  HDOOP_YARN_MASTER: hadoop-yarn-master
---
apiVersion: v1
kind: Service
metadata:
  name: hadoop-hdfs-master
spec:
  type: NodePort
  selector:
    app: hdfs-master
  ports:
    - name: rpc
      port: 9000
      targetPort: 9000
    - name: http
      port: 50070
      targetPort: 50070
      nodePort: 32007
---
apiVersion: v1
kind: Pod
metadata:
  name: hdfs-master
  labels:
    app: hdfs-master
spec:
  containers:
    - name: hdfs-master
      image: 192.168.242.132/library/kubernetes-hadoop:latest
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 9000
        - containerPort: 50070    
      env:
        - name: HADOOP_NODE_TYPE
          value: namenode
        - name: HDFS_MASTER_SERVICE
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDFS_MASTER_SERVICE
        - name: HDOOP_YARN_MASTER
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDOOP_YARN_MASTER
  restartPolicy: Always
---
apiVersion: v1
kind: Pod
metadata:
    name: hadoop-datanode-1
    labels:
      app: hadoop-datanode-1
spec:
  containers:
    - name: hadoop-datanode-1
      image: 192.168.242.132/library/kubernetes-hadoop:latest
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 9000
        - containerPort: 50070    
      env:
        - name: HADOOP_NODE_TYPE
          value: datanode
        - name: HDFS_MASTER_SERVICE
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDFS_MASTER_SERVICE
        - name: HDOOP_YARN_MASTER
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDOOP_YARN_MASTER        
  restartPolicy: Always
---
apiVersion: v1
kind: Pod
metadata:
    name: hadoop-datanode-2
    labels:
      app: hadoop-datanode-2
spec:
  containers:
    - name: hadoop-datanode-2
      image: 192.168.242.132/library/kubernetes-hadoop:latest
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 9000
        - containerPort: 50070    
      env:
        - name: HADOOP_NODE_TYPE
          value: datanode
        - name: HDFS_MASTER_SERVICE
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDFS_MASTER_SERVICE
        - name: HDOOP_YARN_MASTER
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDOOP_YARN_MASTER        
  restartPolicy: Always
---
apiVersion: v1
kind: Pod
metadata:
    name: hadoop-datanode-3
    labels:
      app: hadoop-datanode-3
spec:
  containers:
    - name: hadoop-datanode-3
      image: 192.168.242.132/library/kubernetes-hadoop:latest
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 9000
        - containerPort: 50070    
      env:
        - name: HADOOP_NODE_TYPE
          value: datanode
        - name: HDFS_MASTER_SERVICE
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDFS_MASTER_SERVICE
        - name: HDOOP_YARN_MASTER
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDOOP_YARN_MASTER        
  restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: hadoop-yarn-master
spec:
  type: NodePort
  selector:
    app: yarn-master
  ports:
     - name: "8030"       
       port: 8030
     - name: "8031"     
       port: 8031
     - name: "8032"
       port: 8032     
     - name: http
       port: 8088
       targetPort: 8088
       nodePort: 32088
---
apiVersion: v1
kind: Pod
metadata:
  name: yarn-master
  labels:
    app: yarn-master
spec:
  containers:
    - name: yarn-master
      image: 192.168.242.132/library/kubernetes-hadoop:latest
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 9000
        - containerPort: 50070    
      env:
        - name: HADOOP_NODE_TYPE
          value: resourceman
        - name: HDFS_MASTER_SERVICE
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDFS_MASTER_SERVICE
        - name: HDOOP_YARN_MASTER
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDOOP_YARN_MASTER          
  restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: yarn-node-1
spec:
  clusterIP: None
  selector:
    app: yarn-node-1
  ports:
     - port: 8040
---
apiVersion: v1
kind: Service
metadata:
  name: yarn-node-2
spec:
  clusterIP: None
  selector:
    app: yarn-node-2
  ports:
     - port: 8040
---
apiVersion: v1
kind: Service
metadata:
  name: yarn-node-3
spec:
  clusterIP: None
  selector:
    app: yarn-node-3
  ports:
     - port: 8040
---
apiVersion: v1
kind: Pod
metadata:
  name: yarn-node-1
  labels:
    app: yarn-node-1
spec:
  containers:
    - name: yarn-node-1
      image: 192.168.242.132/library/kubernetes-hadoop:latest
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 8040
        - containerPort: 8041   
        - containerPort: 8042        
      env:
        - name: HADOOP_NODE_TYPE
          value: yarnnode
        - name: HDFS_MASTER_SERVICE
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDFS_MASTER_SERVICE
        - name: HDOOP_YARN_MASTER
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDOOP_YARN_MASTER          
  restartPolicy: Always
---
apiVersion: v1
kind: Pod
metadata:
  name: yarn-node-2
  labels:
    app: yarn-node-2
spec:
  containers:
    - name: yarn-node-2
      image: 192.168.242.132/library/kubernetes-hadoop:latest
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 8040
        - containerPort: 8041   
        - containerPort: 8042        
      env:
        - name: HADOOP_NODE_TYPE
          value: yarnnode
        - name: HDFS_MASTER_SERVICE
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDFS_MASTER_SERVICE
        - name: HDOOP_YARN_MASTER
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDOOP_YARN_MASTER          
  restartPolicy: Always
---
apiVersion: v1
kind: Pod
metadata:
  name: yarn-node-3
  labels:
    app: yarn-node-3
spec:
  containers:
    - name: yarn-node-3
      image: 192.168.242.132/library/kubernetes-hadoop:latest
      imagePullPolicy: IfNotPresent
      ports:
        - containerPort: 8040
        - containerPort: 8041   
        - containerPort: 8042        
      env:
        - name: HADOOP_NODE_TYPE
          value: yarnnode
        - name: HDFS_MASTER_SERVICE
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDFS_MASTER_SERVICE
        - name: HDOOP_YARN_MASTER
          valueFrom:
            configMapKeyRef:
              name: kube-hadoop-conf
              key: HDOOP_YARN_MASTER          
  restartPolicy: Always

这个yaml文件包含一个ConfigMap,5个Service,8个pod,这里需要注意的是ConfigMap中HDFS_MASTER_SERVICE和HDOOP_YARN_MASTER不要使用IP,使用HDFS service的名称,否则datanode将会连接不上namenode,出现错误【ipc.Client: Retrying connect to server: xxx:9000.】

  1. 执行创建命令
kubectl create -f hadoop.yaml
img_9a789c9c9680d17deff0181eecc36dbd.png
create
  1. 检查是否创建成功
  • 查看config map
kubectl get configmap -o wide
img_0a8e1df53770592890db41ab4cb22f00.png
config map
  • 查看service
kubectl get svc -o wide
img_448f5d9248cf254be7693b5a2bf48224.png
service
  • 查看pod
kubectl get po -o wide
img_08cbcd57d21355d21be684d42980e772.png
pod
  • 通过浏览器访问HDFS管理界面【http://192.168.242.136:32007】
img_bfbd347a19a5770f3bcccff3946a20f8.png
overview

查看datanode

img_ebc353c87140868a4419b477840d3e5a.png
datanodes

全部正常,搭建成功!
但是上面的搭建方式是以单个POD声明的,这种方式不稳定,如果不小心删除后就没有了,其实我们可以使用Replication Controller方式进行搭建,这样的话始终可以确保保留相应数量的POD,具体的yaml文件如下:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-hadoop-conf
  namespace: default
data:
  HDFS_MASTER_SERVICE: hadoop-hdfs-master
  HDOOP_YARN_MASTER: hadoop-yarn-master
---
apiVersion: v1
kind: Service
metadata:
  name: hadoop-hdfs-master
spec:
  type: NodePort
  selector:
    name: hdfs-master
  ports:
    - name: rpc
      port: 9000
      targetPort: 9000
    - name: http
      port: 50070
      targetPort: 50070
      nodePort: 32007
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: hdfs-master
  labels:
    name: hdfs-master
spec:
  replicas: 1
  selector:
    name: hdfs-master
  template:
    metadata:
      labels:
        name: hdfs-master
    spec:
      containers:
        - name: hdfs-master
          image: 10.3.13.184/library/kubernetes-hadoop:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 9000
            - containerPort: 50070    
          env:
            - name: HADOOP_NODE_TYPE
              value: namenode
            - name: HDFS_MASTER_SERVICE
              valueFrom:
                configMapKeyRef:
                  name: kube-hadoop-conf
                  key: HDFS_MASTER_SERVICE
            - name: HDOOP_YARN_MASTER
              valueFrom:
                configMapKeyRef:
                  name: kube-hadoop-conf
                  key: HDOOP_YARN_MASTER
      restartPolicy: Always
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: hadoop-datanode
  labels:
    app: hadoop-datanode
spec:
  replicas: 3
  selector:
    name: hadoop-datanode
  template:
    metadata:
      labels:
        name: hadoop-datanode
    spec:
      containers:
        - name: hadoop-datanode
          image: 10.3.13.184/library/kubernetes-hadoop:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 9000
            - containerPort: 50070    
          env:
            - name: HADOOP_NODE_TYPE
              value: datanode
            - name: HDFS_MASTER_SERVICE
              valueFrom:
                configMapKeyRef:
                  name: kube-hadoop-conf
                  key: HDFS_MASTER_SERVICE
            - name: HDOOP_YARN_MASTER
              valueFrom:
                configMapKeyRef:
                  name: kube-hadoop-conf
                  key: HDOOP_YARN_MASTER
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: hadoop-yarn-master
spec:
  type: NodePort
  selector:
    name: yarn-master
  ports:
     - name: "8030"       
       port: 8030
     - name: "8031"     
       port: 8031
     - name: "8032"
       port: 8032     
     - name: http
       port: 8088
       targetPort: 8088
       nodePort: 32088
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: yarn-master
  labels:
    name: yarn-master
spec:
  replicas: 1
  selector:
    name: yarn-master
  template:
    metadata:
      labels:
        name: yarn-master
    spec:
      containers:
        - name: yarn-master
          image: 10.3.13.184/library/kubernetes-hadoop:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 9000
            - containerPort: 50070    
          env:
            - name: HADOOP_NODE_TYPE
              value: resourceman
            - name: HDFS_MASTER_SERVICE
              valueFrom:
                configMapKeyRef:
                  name: kube-hadoop-conf
                  key: HDFS_MASTER_SERVICE
            - name: HDOOP_YARN_MASTER
              valueFrom:
                configMapKeyRef:
                  name: kube-hadoop-conf
                  key: HDOOP_YARN_MASTER          
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: yarn-node
spec:
  clusterIP: None
  selector:
    name: yarn-node
  ports:
     - port: 8040
---
apiVersion: v1
kind: ReplicationController
metadata:
  name: yarn-node
  labels:
    name: yarn-node
spec:
  replicas: 3
  selector:
    name: yarn-node
  template:
    metadata:
      labels:
        name: yarn-node
    spec:
      containers:
        - name: yarn-node
          image: 10.3.13.184/library/kubernetes-hadoop:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8040
            - containerPort: 8041   
            - containerPort: 8042        
          env:
            - name: HADOOP_NODE_TYPE
              value: yarnnode
            - name: HDFS_MASTER_SERVICE
              valueFrom:
                configMapKeyRef:
                  name: kube-hadoop-conf
                  key: HDFS_MASTER_SERVICE
            - name: HDOOP_YARN_MASTER
              valueFrom:
                configMapKeyRef:
                  name: kube-hadoop-conf
                  key: HDOOP_YARN_MASTER          
      restartPolicy: Always

使用kubectl命令创建

img_9224413ada8166f6b92a0bff0ad74cec.png
创建

创建完成后就可以通过浏览器看到熟悉的HDFS管理界面了

img_bbf3794a014f83e3326d0aa38c6d4d50.png
HDFS
img_3764a3a588572df7329db33298fd30fd.png
image.png

参考:Hadoop 运行在 Kubernetes平台实践

相关实践学习
通过Ingress进行灰度发布
本场景您将运行一个简单的应用,部署一个新的应用用于新的发布,并通过Ingress能力实现灰度发布。
容器应用与集群管理
欢迎来到《容器应用与集群管理》课程,本课程是“云原生容器Clouder认证“系列中的第二阶段。课程将向您介绍与容器集群相关的概念和技术,这些概念和技术可以帮助您了解阿里云容器服务ACK/ACK Serverless的使用。同时,本课程也会向您介绍可以采取的工具、方法和可操作步骤,以帮助您了解如何基于容器服务ACK Serverless构建和管理企业级应用。 学习完本课程后,您将能够: 掌握容器集群、容器编排的基本概念 掌握Kubernetes的基础概念及核心思想 掌握阿里云容器服务ACK/ACK Serverless概念及使用方法 基于容器服务ACK Serverless搭建和管理企业级网站应用
目录
相关文章
|
2月前
|
分布式计算 Kubernetes Hadoop
大数据-82 Spark 集群模式启动、集群架构、集群管理器 Spark的HelloWorld + Hadoop + HDFS
大数据-82 Spark 集群模式启动、集群架构、集群管理器 Spark的HelloWorld + Hadoop + HDFS
177 6
|
22天前
|
Kubernetes 监控 Cloud Native
Kubernetes集群的高可用性与伸缩性实践
Kubernetes集群的高可用性与伸缩性实践
56 1
|
2月前
|
JSON Kubernetes 容灾
ACK One应用分发上线:高效管理多集群应用
ACK One应用分发上线,主要介绍了新能力的使用场景
|
2月前
|
Kubernetes 持续交付 开发工具
ACK One GitOps:ApplicationSet UI简化多集群GitOps应用管理
ACK One GitOps新发布了多集群应用控制台,支持管理Argo CD ApplicationSet,提升大规模应用和集群的多集群GitOps应用分发管理体验。
|
2月前
|
Kubernetes Ubuntu Linux
Centos7 搭建 kubernetes集群
本文介绍了如何搭建一个三节点的Kubernetes集群,包括一个主节点和两个工作节点。各节点运行CentOS 7系统,最低配置为2核CPU、2GB内存和15GB硬盘。详细步骤包括环境配置、安装Docker、关闭防火墙和SELinux、禁用交换分区、安装kubeadm、kubelet、kubectl,以及初始化Kubernetes集群和安装网络插件Calico或Flannel。
194 4
|
2月前
|
分布式计算 Hadoop Shell
Hadoop-35 HBase 集群配置和启动 3节点云服务器 集群效果测试 Shell测试
Hadoop-35 HBase 集群配置和启动 3节点云服务器 集群效果测试 Shell测试
76 4
|
2月前
|
SQL 分布式计算 Hadoop
Hadoop-37 HBase集群 JavaAPI 操作3台云服务器 POM 实现增删改查调用操作 列族信息 扫描全表
Hadoop-37 HBase集群 JavaAPI 操作3台云服务器 POM 实现增删改查调用操作 列族信息 扫描全表
34 3
|
2月前
|
分布式计算 Hadoop Shell
Hadoop-36 HBase 3节点云服务器集群 HBase Shell 增删改查 全程多图详细 列族 row key value filter
Hadoop-36 HBase 3节点云服务器集群 HBase Shell 增删改查 全程多图详细 列族 row key value filter
59 3
|
2月前
|
Kubernetes 应用服务中间件 nginx
搭建Kubernetes v1.31.1服务器集群,采用Calico网络技术
在阿里云服务器上部署k8s集群,一、3台k8s服务器,1个Master节点,2个工作节点,采用Calico网络技术。二、部署nginx服务到k8s集群,并验证nginx服务运行状态。
794 1
|
2月前
|
分布式计算 Java Hadoop
Hadoop-30 ZooKeeper集群 JavaAPI 客户端 POM Java操作ZK 监听节点 监听数据变化 创建节点 删除节点
Hadoop-30 ZooKeeper集群 JavaAPI 客户端 POM Java操作ZK 监听节点 监听数据变化 创建节点 删除节点
67 1