Kubernetes-在Kubernetes集群上搭建Stateful Elasticsearch集群

简介: 准备工作Elasticsearch镜像,我以Elasticsearch官方镜像的5.6.10版本为基础创建的。DockerfileFROM docker.

准备工作


  1. Elasticsearch镜像,我以Elasticsearch官方镜像的5.6.10版本为基础创建的。
Dockerfile
FROM docker.elastic.co/elasticsearch/elasticsearch:5.6.10
MAINTAINER leo.lee(lis85@163.com)
WORKDIR /usr/share/elasticsearch
USER root
# copying custom-entrypoint.sh and configuration (elasticsearch.yml, log4j2.properties)
# to their respective directories in /usr/share/elasticsearch (already the WORKDIR)
COPY custom-entrypoint.sh bin/
COPY elasticsearch.yml config/
COPY log4j2.properties config/

# assuring "elasticsearch" user have appropriate access to configuration and custom-entrypoint.sh
# make sure custom-entrypoint.sh is executable
RUN chown elasticsearch:elasticsearch config/elasticsearch.yml config/log4j2.properties bin/custom-entrypoint.sh && \
    chmod +x bin/custom-entrypoint.sh

# start by running the custom entrypoint (as root)
CMD ["/bin/bash", "bin/custom-entrypoint.sh"]
custom-entrypoint.sh
#!/bin/bash
# This is expected to run as root for setting the ulimits

set -e
##################################################################################
# ensure increased ulimits - for nofile - for the Elasticsearch containers
# the limit on the number of files that a single process can have open at a time (default is 1024)
ulimit -n 65536

# ensure increased ulimits - for nproc - for the Elasticsearch containers
# the limit on the number of processes that elasticsearch can create
# 2048 is min to pass the linux checks (default is 50)
# https://www.elastic.co/guide/en/elasticsearch/reference/current/max-number-threads-check.html
ulimit -u 2048

# swapping needs to be disabled for performance and node stability
# in ElasticSearch config we are using: [bootstrap.memory_lock=true]
# this additionally requires the "memlock: true" ulimit; specifically set for each container
# -l: max locked memory 
ulimit -l unlimited

# running command to start elasticsearch
# passing all inputs of this entry point script to the es-docker startup script
# NOTE: this entry point script is run as root; but executes the es-docker
# startup script as the elasticsearch user, passing all the root environment-variables 
# to the elasticsearch user 
su elasticsearch bin/es-docker "$@"
elasticsearch.yml
# attaching the namespace to the cluster.name to differentiate different clusters
# ex. elasticsearh-acceptance, elasticsearh-production, elasticsearh-monitoring
cluster.name: "elasticsearch-${NAMESPACE}"

# we provide a node.name that is the POD_NAME-NAMESPACE
# ex. elasticsearh-0-acceptance, elasticsearh-1-acceptance, elasticsearh-2-acceptance
node.name: "${POD_NAME}-${NAMESPACE}"

network.host: ${POD_IP}

# A hostname that resolves to multiple IP addresses will try all resolved addresses 
# we provide the name for the headless service 
# which resolves to the ip addresses of all the live attached pods
# alternatively we can directly reference the hostnames of the pods
discovery.zen.ping.unicast.hosts: es-discovery-svc

# minimum_master_nodes need to be explicitly set when bound on a public IP
# set to 1 to allow single node clusters
# more info: https://github.com/elastic/elasticsearch/pull/17288
discovery.zen.minimum_master_nodes: 2

bootstrap.memory_lock: true

#-------------------------------------------------------------------------------------
# RECOVERY: https://www.elastic.co/guide/en/elasticsearch/guide/current/important-configuration-changes.html
# SETTINGS TO avoid the excessive shard swapping that can occur on cluster restarts
#-------------------------------------------------------------------------------------
# how many nodes shall be present to consider the cluster functional;
# prevents Elasticsearch from starting recovery until these nodes are available
gateway.recover_after_nodes: 2

# how many nodes are expected in the cluster
gateway.expected_nodes: 3

# how long we want to wait after [gateway.recover_after_nodes] is reached in order to start recovery process (if applicable). 
gateway.recover_after_time: 5m
#-------------------------------------------------------------------------------------

# The following settings control the fault detection process using the discovery.zen.fd prefix:
# How often a node gets pinged. Defaults to 1s.
discovery.zen.fd.ping_interval: 1s

# How long to wait for a ping response, defaults to 30s.
discovery.zen.fd.ping_timeout: 10s

# How many ping failures / timeouts cause a node to be considered failed. Defaults to 3.
discovery.zen.fd.ping_retries: 2
log4j2.properties
status = error
appender.console.type = Console
appender.console.name = console
appender.console.layout.type = PatternLayout
appender.console.layout.pattern = [%d{ISO8601}][%-5p][%-25c{1.}] %marker%m%n
rootLogger.level = info
rootLogger.appenderRef.console.ref = console

将这4个文件放入同一级目录下,然后在该目录下使用命令创建镜像

docker build -t [image name]:[version] .

创建完成后将镜像上传到私有镜像仓库中。

  1. ES需要用到存储,需要提前创建持久卷(PV)
persistent-volume-es.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: k8s-pv-es1
  labels:
    type: local
spec:
  storageClassName: gce-standard-sc
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/usr/share/elasticsearch/data"
  persistentVolumeReclaimPolicy: Recycle
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: k8s-pv-es2
  labels:
    type: local
spec:
  storageClassName: gce-standard-sc
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/usr/share/elasticsearch/data"
  persistentVolumeReclaimPolicy: Recycle
---
kind: PersistentVolume
apiVersion: v1
metadata:
  name: k8s-pv-es3  
  labels:
    type: local
spec:
  storageClassName: gce-standard-sc
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/usr/share/elasticsearch/data"
  persistentVolumeReclaimPolicy: Recycle

使用如下命令创建

kubectl create -f persistent-volume-es.yaml
img_66519e9ffdcc7c37d39318e739c1d2b3.png
create pv

部署Elasticsearch集群


elasticsearch.yaml
#create the statefulset headless service
apiVersion: v1
kind: Service
metadata:
  name: es-discovery-svc
  labels:
    app: es-discovery-svc
spec:
  # the set of Pods targeted by this Service are determined by the Label Selector
  selector:
    app: elasticsearch
  # exposing elasticsearch transport port (only)
  # this service will be used by es-nodes for discovery;
  # communication between es-nodes happens through 
  # the transport port (9300)
  ports:
  - protocol: TCP
    # port exposed by the service (service reacheable at)
    port: 9300
    # port exposed by the Pod(s) the service abstracts (pod reacheable at) 
    # can be a string representing the name of the port @the pod (ex. transport)
    targetPort: 9300
    name: transport
  # specifying this is a headless service by providing ClusterIp "None"
  clusterIP: None
---
#create the cluster-ip service
apiVersion: v1
kind: Service
metadata:
  name: es-ia-svc
  labels:
    app: es-ia-svc
spec:
  selector:
    app: elasticsearch
  ports:
  - name: http
    port: 9200
    protocol: TCP
  - name: transport
    port: 9300
    protocol: TCP
---
#create the stateful set
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: elasticsearch
  labels:
    app: elasticsearch
spec:
  # the headless-service that governs this StatefulSet
  # responsible for the network identity of the set.
  serviceName: es-discovery-svc
  replicas: 3
  # Template is the object that describes the pod that will be created
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      securityContext:
        # allows read/write access for mounted volumes  
        # by users that belong to a group with gid: 1000
        fsGroup: 1000
      initContainers:
      # init-container for setting the mmap count limit
      - name: sysctl
        image: busybox
        imagePullPolicy: IfNotPresent
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      containers:
      - name: elasticsearch
        securityContext:
          # applying fix in: https://github.com/kubernetes/kubernetes/issues/3595#issuecomment-287692878 
          # https://docs.docker.com/engine/reference/run/#operator-exclusive-options
          capabilities:
            add:
              # Lock memory (mlock(2), mlockall(2), mmap(2), shmctl(2))
              - IPC_LOCK  
              # Override resource Limits
              - SYS_RESOURCE
        image: registry.docker.uih/library/leo-elsticsearch:5.6.10
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 9300
          name: transport
          protocol: TCP
        - containerPort: 9200
          name: http
          protocol: TCP
        env:
        # environment variables to be directly refrenced from the configuration
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        # elasticsearch heapsize (to be adjusted based on need)
        - name: "ES_JAVA_OPTS"
          value: "-Xms2g -Xmx2g"
        # mounting storage persistent volume completely on the data dir
        volumeMounts:
        - name: es-data-vc
          mountPath: /usr/share/elasticsearch/data
  # The StatefulSet guarantees that a given [POD] network identity will 
  # always map to the same storage identity
  volumeClaimTemplates:
  - metadata:
      name: es-data-vc
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          # elasticsearch mounted data directory size (to be adjusted based on need)
          storage: 20Gi
      storageClassName: gce-standard-sc
      # no LabelSelector  defined
      # claims can specify a label selector to further filter the set of volumes
      # currently, a PVC with a non-empty selector can not have a PV dynamically provisioned for it
      # no volumeName is provided

使用如下命令进行部署

kubectl create -f elasticsearch.yaml

部署完后发现还没运行起来,通过日志查出是用户对文件夹【/usr/share/elasticsearch】没有权限引起的,文件夹的权限是root用户,这里可以在外面通过root用户手动修改文件夹权限,将文件夹权限赋给普通用户即可。

相关实践学习
深入解析Docker容器化技术
Docker是一个开源的应用容器引擎,让开发者可以打包他们的应用以及依赖包到一个可移植的容器中,然后发布到任何流行的Linux机器上,也可以实现虚拟化,容器是完全使用沙箱机制,相互之间不会有任何接口。Docker是世界领先的软件容器平台。开发人员利用Docker可以消除协作编码时“在我的机器上可正常工作”的问题。运维人员利用Docker可以在隔离容器中并行运行和管理应用,获得更好的计算密度。企业利用Docker可以构建敏捷的软件交付管道,以更快的速度、更高的安全性和可靠的信誉为Linux和Windows Server应用发布新功能。 在本套课程中,我们将全面的讲解Docker技术栈,从环境安装到容器、镜像操作以及生产环境如何部署开发的微服务应用。本课程由黑马程序员提供。     相关的阿里云产品:容器服务 ACK 容器服务 Kubernetes 版(简称 ACK)提供高性能可伸缩的容器应用管理能力,支持企业级容器化应用的全生命周期管理。整合阿里云虚拟化、存储、网络和安全能力,打造云端最佳容器化应用运行环境。 了解产品详情: https://www.aliyun.com/product/kubernetes
目录
相关文章
|
6月前
|
人工智能 算法 调度
阿里云ACK托管集群Pro版共享GPU调度操作指南
本文介绍在阿里云ACK托管集群Pro版中,如何通过共享GPU调度实现显存与算力的精细化分配,涵盖前提条件、使用限制、节点池配置及任务部署全流程,提升GPU资源利用率,适用于AI训练与推理场景。
580 2
|
6月前
|
弹性计算 监控 调度
ACK One 注册集群云端节点池升级:IDC 集群一键接入云端 GPU 算力,接入效率提升 80%
ACK One注册集群节点池实现“一键接入”,免去手动编写脚本与GPU驱动安装,支持自动扩缩容与多场景调度,大幅提升K8s集群管理效率。
391 89
|
11月前
|
资源调度 Kubernetes 调度
从单集群到多集群的快速无损转型:ACK One 多集群应用分发
ACK One 的多集群应用分发,可以最小成本地结合您已有的单集群 CD 系统,无需对原先应用资源 YAML 进行修改,即可快速构建成多集群的 CD 系统,并同时获得强大的多集群资源调度和分发的能力。
776 9
|
11月前
|
资源调度 Kubernetes 调度
从单集群到多集群的快速无损转型:ACK One 多集群应用分发
本文介绍如何利用阿里云的分布式云容器平台ACK One的多集群应用分发功能,结合云效CD能力,快速将单集群CD系统升级为多集群CD系统。通过增加分发策略(PropagationPolicy)和差异化策略(OverridePolicy),并修改单集群kubeconfig为舰队kubeconfig,可实现无损改造。该方案具备多地域多集群智能资源调度、重调度及故障迁移等能力,帮助用户提升业务效率与可靠性。
|
存储 Kubernetes 监控
K8s集群实战:使用kubeadm和kuboard部署Kubernetes集群
总之,使用kubeadm和kuboard部署K8s集群就像回归童年一样,简单又有趣。不要忘记,技术是为人服务的,用K8s集群操控云端资源,我们不过是想在复杂的世界找寻简单。尽管部署过程可能遇到困难,但朝着简化复杂的目标,我们就能找到意义和乐趣。希望你也能利用这些工具,找到你的乐趣,满足你的需求。
1122 33
|
Java Linux
CentOS环境搭建Elasticsearch集群
至此,您已成功在CentOS环境下搭建了Elasticsearch集群。通过以上介绍和步骤,相信您对部署Elasticsearch集群有了充分的了解。最后祝您在使用Elasticsearch集群的过程中顺利开展工作!
598 22
|
Kubernetes 开发者 Docker
集群部署:使用Rancher部署Kubernetes集群。
以上就是使用 Rancher 部署 Kubernetes 集群的流程。使用 Rancher 和 Kubernetes,开发者可以受益于灵活性和可扩展性,允许他们在多种环境中运行多种应用,同时利用自动化工具使工作负载更加高效。
712 19
|
人工智能 分布式计算 调度
打破资源边界、告别资源浪费:ACK One 多集群Spark和AI作业调度
ACK One多集群Spark作业调度,可以帮助您在不影响集群中正在运行的在线业务的前提下,打破资源边界,根据各集群实际剩余资源来进行调度,最大化您多集群中闲置资源的利用率。
|
Prometheus Kubernetes 监控
OpenAI故障复盘丨如何保障大规模K8s集群稳定性
OpenAI故障复盘丨如何保障大规模K8s集群稳定性
518 0
OpenAI故障复盘丨如何保障大规模K8s集群稳定性
|
运维 分布式计算 Kubernetes
ACK One多集群Service帮助大批量应用跨集群无缝迁移
ACK One多集群Service可以帮助您,在无需关注服务间的依赖,和最小化迁移风险的前提下,完成跨集群无缝迁移大批量应用。

推荐镜像

更多