Installing a Highly Available Kubernetes Master Cluster


Host            Role          Components
172.18.6.101    K8S Master    kubelet, kubectl, CNI, etcd
172.18.6.102    K8S Master    kubelet, kubectl, CNI, etcd
172.18.6.103    K8S Master    kubelet, kubectl, CNI, etcd
172.18.6.104    K8S Worker    kubelet, CNI
172.18.6.105    K8S Worker    kubelet, CNI
172.18.6.106    K8S Worker    kubelet, CNI

Installing etcd

To keep the Kubernetes masters highly available, running the etcd cluster in containers is not recommended: a container can die at any moment, while each etcd node's service is stateful. We therefore deploy etcd from binaries here. In production, run an etcd cluster of at least three nodes. For the detailed steps, see the guide on setting up an etcd cluster as a local service.
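As a quick sanity check after the binary deployment, the cluster should report all three members as healthy. A sketch, assuming the v2 etcdctl client and the default client port that the apiserver manifests below also use (http://127.0.0.1:2379):

# Run on any master; member URLs depend on how etcd was configured.
etcdctl --endpoints=http://127.0.0.1:2379 cluster-health
# expected: "cluster is healthy" with all three members listed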

Required components and certificates

CA certificate

Following the certificate-generation guide for Kubernetes, create the CA certificate, then place ca-key.pem and ca.pem under /etc/kubernetes/ssl on every node in the cluster.

Creating the worker certificates

Following the node-certificate section of the same guide, generate a certificate for each worker, and place each IP's certificate under /etc/kubernetes/ssl on the corresponding worker node.

Installing the kubelet.conf configuration

Create /etc/kubernetes/kubelet.conf with the following content:

apiVersion: v1
kind: Config
clusters:
- name: local
  cluster:
    server: https://[load balancer IP]:[apiserver port]
    certificate-authority: /etc/kubernetes/ssl/ca.pem
users:
- name: kubelet
  user:
    client-certificate: /etc/kubernetes/ssl/worker.pem
    client-key: /etc/kubernetes/ssl/worker-key.pem
contexts:
- context:
    cluster: local
    user: kubelet
  name: kubelet-context
current-context: kubelet-context
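The same file also works as a kubeconfig for kubectl, which gives a quick way to validate it once the load balancer and the master components described below are running:

# Sanity-check the kubeconfig against the load-balanced apiserver.
kubectl --kubeconfig=/etc/kubernetes/kubelet.conf get nodes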

Installing the CNI plugins

Download the required CNI binaries from the containernetworking cni project and place them under /opt/cni/bin on every node in the cluster.

An RPM package for one-step installation will be provided later.
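Until then, a release tarball from the cni project can be unpacked straight into /opt/cni/bin. A sketch; the version and tarball name are assumptions, so pick the release matching your Kubernetes version:

# Download and unpack the CNI plugin binaries on every node.
CNI_VERSION=v0.5.2
mkdir -p /opt/cni/bin
curl -L "https://github.com/containernetworking/cni/releases/download/${CNI_VERSION}/cni-amd64-${CNI_VERSION}.tgz" | tar -xz -C /opt/cni/bin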

Deploying the kubelet service

Note: an RPM package for one-step installation will be provided later.

Place the kubelet binary of the matching version under /usr/bin on every node in the cluster.
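The binary can be fetched from the standard Kubernetes release bucket (a sketch; the version matches the v1.5.2 images used throughout this guide):

# Install the kubelet binary on every node.
K8S_VERSION=v1.5.2
curl -L "https://storage.googleapis.com/kubernetes-release/release/${K8S_VERSION}/bin/linux/amd64/kubelet" -o /usr/bin/kubelet
chmod +x /usr/bin/kubelet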

Create /etc/systemd/system/kubelet.service with the following content:

# /etc/systemd/system/kubelet.service
[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=http://kubernetes.io/docs/

[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--kubeconfig=/etc/kubernetes/kubelet.conf --require-kubeconfig=true"
Environment="KUBELET_SYSTEM_PODS_ARGS=--pod-manifest-path=/etc/kubernetes/manifests --allow-privileged=true"
Environment="KUBELET_NETWORK_ARGS=--network-plugin=cni --cni-conf-dir=/etc/cni/net.d --cni-bin-dir=/opt/cni/bin"
Environment="KUBELET_DNS_ARGS=--cluster-dns=10.100.0.10 --cluster-domain=cluster.local"
Environment="KUBELET_EXTRA_ARGS=--pod-infra-container-image=registry.aliyuncs.com/shenshouer/pause-amd64:3.0"
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_SYSTEM_PODS_ARGS $KUBELET_NETWORK_ARGS $KUBELET_DNS_ARGS $KUBELET_EXTRA_ARGS
Restart=always
StartLimitInterval=0
RestartSec=10

[Install]
WantedBy=multi-user.target

Create the following directory layout:

/etc/kubernetes/
|-- kubelet.conf
|-- manifests
`-- ssl
    |-- ca-key.pem
    |-- ca.pem
    |-- worker.csr
    |-- worker-key.pem
    |-- worker-openssl.cnf
    `-- worker.pem
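With the unit file and directory layout in place, reload systemd and start the service (the same commands are used again for the worker nodes at the end of this guide):

systemctl daemon-reload
systemctl enable kubelet
systemctl start kubelet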

Installing the master components

Configuring the load balancer

Configure LVS with the VIP 172.18.6.254 pointing at the backends 172.18.6.101, 172.18.6.102, and 172.18.6.103. For a simpler setup, nginx can provide the TCP layer-4 load balancing instead.
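If you take the nginx route, a minimal layer-4 configuration could look like the sketch below, assuming nginx was built with the stream module and the apiservers listen on the secure port 6443 configured in the manifest that follows; the config path is illustrative:

# Minimal TCP (layer-4) proxying across the three apiservers.
cat > /etc/nginx/nginx.conf <<'EOF'
events {}
stream {
    upstream kube_apiserver {
        server 172.18.6.101:6443;
        server 172.18.6.102:6443;
        server 172.18.6.103:6443;
    }
    server {
        listen 6443;
        proxy_pass kube_apiserver;
    }
}
EOF
nginx -t && nginx -s reload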

Certificate generation

openssl.cnf content:

[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name
[req_distinguished_name]
[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names
[alt_names]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
# load-balancer domain that may be used
DNS.5 = test.example.com.cn
IP.1 = 10.96.0.1
# the three master IPs
IP.2 = 172.18.6.101
IP.3 = 172.18.6.102
IP.4 = 172.18.6.103
# VIP of the LVS load balancer
IP.5 = 172.18.6.254

For the concrete generation steps, see the Master-certificate and Worker-certificate sections of the Kubernetes certificate-generation guide. Place the generated certificates on the corresponding paths of all three master nodes.
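The flow in that guide boils down to the usual openssl sequence. A sketch, with key size and validity period as assumptions; the referenced guide remains authoritative:

# Generate the apiserver key pair and sign it with the CA, picking up
# the SAN entries from the openssl.cnf above.
openssl genrsa -out apiserver-key.pem 2048
openssl req -new -key apiserver-key.pem -out apiserver.csr -subj "/CN=kube-apiserver" -config openssl.cnf
openssl x509 -req -days 365 -in apiserver.csr -CA ca.pem -CAkey ca-key.pem -CAcreateserial -out apiserver.pem -extensions v3_req -extfile openssl.cnf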

Installing the other master components

Place the following three files under /etc/kubernetes/manifests on each master node.

kube-apiserver.manifest:

# /etc/kubernetes/manifests/kube-apiserver.manifest
{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "kube-apiserver",
    "namespace": "kube-system",
    "creationTimestamp": null,
    "labels": {
      "component": "kube-apiserver",
      "tier": "control-plane"
    }
  },
  "spec": {
    "volumes": [
      {
        "name": "k8s",
        "hostPath": {
          "path": "/etc/kubernetes"
        }
      },
      {
        "name": "certs",
        "hostPath": {
          "path": "/etc/ssl/certs"
        }
      }
    ],
    "containers": [
      {
        "name": "kube-apiserver",
        "image": "registry.aliyuncs.com/shenshouer/kube-apiserver:v1.5.2",
        "command": [
          "kube-apiserver",
          "--insecure-bind-address=127.0.0.1",
          "--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,PersistentVolumeLabel,DefaultStorageClass,ResourceQuota",
          "--service-cluster-ip-range=10.96.0.0/12",
          "--service-account-key-file=/etc/kubernetes/ssl/apiserver-key.pem",
          "--client-ca-file=/etc/kubernetes/ssl/ca.pem",
          "--tls-cert-file=/etc/kubernetes/ssl/apiserver.pem",
          "--tls-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem",
          "--secure-port=6443",
          "--allow-privileged",
          "--advertise-address=[this master node's IP]",
          "--kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname",
          "--anonymous-auth=false",
          "--etcd-servers=http://127.0.0.1:2379"
        ],
        "resources": {
          "requests": {
            "cpu": "250m"
          }
        },
        "volumeMounts": [
          {
            "name": "k8s",
            "readOnly": true,
            "mountPath": "/etc/kubernetes/"
          },
          {
            "name": "certs",
            "mountPath": "/etc/ssl/certs"
          }
        ],
        "livenessProbe": {
          "httpGet": {
            "path": "/healthz",
            "port": 8080,
            "host": "127.0.0.1"
          },
          "initialDelaySeconds": 15,
          "timeoutSeconds": 15,
          "failureThreshold": 8
        }
      }
    ],
    "hostNetwork": true
  },
  "status": {}
}
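Once kubelet picks this manifest up, the apiserver answers locally on the insecure port (8080 is the default that pairs with --insecure-bind-address above):

curl http://127.0.0.1:8080/healthz
# expected output: ok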

kube-controller-manager.manifest:

# /etc/kubernetes/manifests/kube-controller-manager.manifest
{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "kube-controller-manager",
    "namespace": "kube-system",
    "creationTimestamp": null,
    "labels": {
      "component": "kube-controller-manager",
      "tier": "control-plane"
    }
  },
  "spec": {
    "volumes": [
      {
        "name": "k8s",
        "hostPath": {
          "path": "/etc/kubernetes"
        }
      },
      {
        "name": "certs",
        "hostPath": {
          "path": "/etc/ssl/certs"
        }
      }
    ],
    "containers": [
      {
        "name": "kube-controller-manager",
        "image": "registry.aliyuncs.com/shenshouer/kube-controller-manager:v1.5.2",
        "command": [
          "kube-controller-manager",
          "--address=127.0.0.1",
          "--leader-elect",
          "--master=127.0.0.1:8080",
          "--cluster-name=kubernetes",
          "--root-ca-file=/etc/kubernetes/ssl/ca.pem",
          "--service-account-private-key-file=/etc/kubernetes/ssl/apiserver-key.pem",
          "--cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem",
          "--cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem",
          "--insecure-experimental-approve-all-kubelet-csrs-for-group=system:kubelet-bootstrap",
          "--allocate-node-cidrs=true",
          "--cluster-cidr=10.244.0.0/16"
        ],
        "resources": {
          "requests": {
            "cpu": "200m"
          }
        },
        "volumeMounts": [
          {
            "name": "k8s",
            "readOnly": true,
            "mountPath": "/etc/kubernetes/"
          },
          {
            "name": "certs",
            "mountPath": "/etc/ssl/certs"
          }
        ],
        "livenessProbe": {
          "httpGet": {
            "path": "/healthz",
            "port": 10252,
            "host": "127.0.0.1"
          },
          "initialDelaySeconds": 15,
          "timeoutSeconds": 15,
          "failureThreshold": 8
        }
      }
    ],
    "hostNetwork": true
  },
  "status": {}
}



kube-scheduler.manifest:

# /etc/kubernetes/manifests/kube-scheduler.manifest
{
  "kind": "Pod",
  "apiVersion": "v1",
  "metadata": {
    "name": "kube-scheduler",
    "namespace": "kube-system",
    "creationTimestamp": null,
    "labels": {
      "component": "kube-scheduler",
      "tier": "control-plane"
    }
  },
  "spec": {
    "containers": [
      {
        "name": "kube-scheduler",
        "image": "registry.aliyuncs.com/shenshouer/kube-scheduler:v1.5.2",
        "command": [
          "kube-scheduler",
          "--address=127.0.0.1",
          "--leader-elect",
          "--master=127.0.0.1:8080"
        ],
        "resources": {
          "requests": {
            "cpu": "100m"
          }
        },
        "livenessProbe": {
          "httpGet": {
            "path": "/healthz",
            "port": 10251,
            "host": "127.0.0.1"
          },
          "initialDelaySeconds": 15,
          "timeoutSeconds": 15,
          "failureThreshold": 8
        }
      }
    ],
    "hostNetwork": true
  },
  "status": {}
}

Installing the add-on components

Installing kube-proxy

On any master, run kubectl create -f kube-proxy-ds.yaml, where kube-proxy-ds.yaml contains:

apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  labels:
    component: kube-proxy
    k8s-app: kube-proxy
    kubernetes.io/cluster-service: "true"
    name: kube-proxy
    tier: node
  name: kube-proxy
  namespace: kube-system
spec:
  selector:
    matchLabels:
      component: kube-proxy
      k8s-app: kube-proxy
      kubernetes.io/cluster-service: "true"
      name: kube-proxy
      tier: node
  template:
    metadata:
      labels:
        component: kube-proxy
        k8s-app: kube-proxy
        kubernetes.io/cluster-service: "true"
        name: kube-proxy
        tier: node
    spec:
      containers:
      - command:
        - kube-proxy
        - --kubeconfig=/run/kubeconfig
        - --cluster-cidr=10.244.0.0/16
        image: registry.aliyuncs.com/shenshouer/kube-proxy:v1.5.2
        imagePullPolicy: IfNotPresent
        name: kube-proxy
        resources: {}
        securityContext:
          privileged: true
        terminationMessagePath: /dev/termination-log
        volumeMounts:
        - mountPath: /var/run/dbus
          name: dbus
        - mountPath: /run/kubeconfig
          name: kubeconfig
        - mountPath: /etc/kubernetes/ssl
          name: ssl
      dnsPolicy: ClusterFirst
      hostNetwork: true
      restartPolicy: Always
      securityContext: {}
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /etc/kubernetes/kubelet.conf
        name: kubeconfig
      - hostPath:
          path: /var/run/dbus
        name: dbus
      - hostPath:
          path: /etc/kubernetes/ssl
        name: ssl
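The DaemonSet should schedule one kube-proxy pod per node; verify with:

kubectl -n kube-system get daemonset kube-proxy
kubectl -n kube-system get pods -l k8s-app=kube-proxy -o wide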

Installing the network component

On any master, run kubectl apply -f kube-flannel.yaml, where kube-flannel.yaml is shown below. Note: if you are running in VMs started by vagrant, change the flanneld startup arguments so that --iface points at the actual inter-node NIC.

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flannel
  namespace: kube-system
---
kind: ConfigMap
apiVersion: v1
metadata:
  namespace: kube-system
  name: kube-flannel-cfg
  labels:
    tier: node
    app: flannel
data:
  cni-conf.json: |
    {
      "name": "cbr0",
      "type": "flannel",
      "delegate": {
        "ipMasq": true,
        "bridge": "cbr0",
        "hairpinMode": true,
        "forceAddress": true,
        "isDefaultGateway": true
      }
    }
  net-conf.json: |
    {
      "Network": "10.244.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  namespace: kube-system
  name: kube-flannel-ds
  labels:
    tier: node
    app: flannel
spec:
  template:
    metadata:
      labels:
        tier: node
        app: flannel
    spec:
      hostNetwork: true
      nodeSelector:
        beta.kubernetes.io/arch: amd64
      serviceAccountName: flannel
      containers:
      - name: kube-flannel
        image: registry.aliyuncs.com/shenshouer/flannel:v0.7.0
        command: [ "/opt/bin/flanneld", "--ip-masq", "--kube-subnet-mgr", "--iface=eth0" ]
        securityContext:
          privileged: true
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: POD_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        volumeMounts:
        - name: run
          mountPath: /run
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      - name: install-cni
        image: registry.aliyuncs.com/shenshouer/flannel:v0.7.0
        command: [ "/bin/sh", "-c", "set -e -x; cp -f /etc/kube-flannel/cni-conf.json /etc/cni/net.d/10-flannel.conf; while true; do sleep 3600; done" ]
        volumeMounts:
        - name: cni
          mountPath: /etc/cni/net.d
        - name: flannel-cfg
          mountPath: /etc/kube-flannel/
      volumes:
        - name: run
          hostPath:
            path: /run
        - name: cni
          hostPath:
            path: /etc/cni/net.d
        - name: flannel-cfg
          configMap:
            name: kube-flannel-cfg
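Once the DaemonSet is running, each node should have the CNI config written by the install-cni container, and flanneld's vxlan backend should have created its tunnel device (flannel.1 is the backend's default name):

# Per-node checks after flannel starts.
ls /etc/cni/net.d/10-flannel.conf
ip -d link show flannel.1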

Deploying DNS

On any master, run kubectl create -f skydns.yaml, where skydns.yaml contains:

apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.100.0.10
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
spec:
  # replicas: not specified here:
  # 1. In order to make Addon Manager do not reconcile this replicas parameter.
  # 2. Default is 1.
  # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    rollingUpdate:
      maxSurge: 10%
      maxUnavailable: 0
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
        scheduler.alpha.kubernetes.io/tolerations: '[{"key":"CriticalAddonsOnly", "operator":"Exists"}]'
    spec:
      containers:
      - name: kubedns
        image: registry.aliyuncs.com/shenshouer/kubedns-amd64:1.9
        resources:
          # TODO: Set memory limits when we've profiled the container for large
          # clusters, then set request = limit to keep this container in
          # guaranteed class. Currently, this container falls into the
          # "burstable" category so the kubelet doesn't backoff from restarting it.
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        livenessProbe:
          httpGet:
            path: /healthz-kubedns
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          # we poll on pod startup for the Kubernetes master service and
          # only setup the /readiness HTTP server once that's available.
          initialDelaySeconds: 3
          timeoutSeconds: 5
        args:
        - --domain=cluster.local.
        - --dns-port=10053
        - --config-map=kube-dns
        # This should be set to v=2 only after the new image (cut from 1.5) has
        # been released, otherwise we will flood the logs.
        - --v=0
        - --federations=myfederation=federation.test
        env:
        - name: PROMETHEUS_PORT
          value: "10055"
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
        - containerPort: 10055
          name: metrics
          protocol: TCP
      - name: dnsmasq
        image: registry.aliyuncs.com/shenshouer/kube-dnsmasq-amd64:1.4
        livenessProbe:
          httpGet:
            path: /healthz-dnsmasq
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --cache-size=1000
        - --no-resolv
        - --server=127.0.0.1#10053
        - --log-facility=-
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        # see: https://github.com/kubernetes/kubernetes/issues/29055 for details
        resources:
          requests:
            cpu: 150m
            memory: 10Mi
      - name: dnsmasq-metrics
        image: registry.aliyuncs.com/shenshouer/dnsmasq-metrics-amd64:1.0
        livenessProbe:
          httpGet:
            path: /metrics
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --v=2
        - --logtostderr
        ports:
        - containerPort: 10054
          name: metrics
          protocol: TCP
        resources:
          requests:
            memory: 10Mi
      - name: healthz
        image: registry.aliyuncs.com/shenshouer/exechealthz-amd64:1.2
        resources:
          limits:
            memory: 50Mi
          requests:
            cpu: 10m
            # Note that this container shouldn't really need 50Mi of memory. The
            # limits are set higher than expected pending investigation on #29688.
            # The extra memory was stolen from the kubedns container to keep the
            # net memory requested by the pod constant.
            memory: 50Mi
        args:
        - --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
        - --url=/healthz-dnsmasq
        - --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null
        - --url=/healthz-kubedns
        - --port=8080
        - --quiet
        ports:
        - containerPort: 8080
          protocol: TCP
      dnsPolicy: Default  # Don't use cluster DNS.
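Cluster DNS can then be verified from a throwaway pod (the busybox image is an assumption; any image with nslookup works):

kubectl run -i -t dns-test --image=busybox --restart=Never -- nslookup kubernetes.default.svc.cluster.local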

Installing the worker nodes

Install Docker.

Create the /etc/kubernetes/ directory with the following layout:

/etc/kubernetes/
|-- kubelet.conf
|-- manifests
`-- ssl
    |-- ca-key.pem
    |-- ca.pem
    |-- ca.srl
    |-- worker.csr
    |-- worker-key.pem
    |-- worker-openssl.cnf
    `-- worker.pem

Create the /etc/kubernetes/kubelet.conf configuration; see the kubelet.conf section above.

Create /etc/kubernetes/ssl; for making the certificates, see the worker-certificate section above.

Create /etc/kubernetes/manifests.

Create /opt/cni/bin; for installing CNI, see the CNI installation steps above.

Install kubelet as described in the kubelet deployment section, then enable and start it:

systemctl enable kubelet && systemctl restart kubelet && journalctl -fu kubelet
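Once kubelet registers, the new workers should show up from any master:

kubectl -s 127.0.0.1:8080 get nodes
# expected: 172.18.6.104-106 listed, Ready once flannel is running on them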

This article is reposted from the CSDN article “安装k8s Master高可用集群”.
