Upgrading a Kubernetes Cluster Built with kubeadm

Summary: upgrading a Kubernetes cluster that was set up with kubeadm.

Upgrade Notes


  • A working Kubernetes cluster built with kubeadm
  • You can upgrade across patch releases, or across one minor release; skipping two or more minor releases is not recommended
  • Back up the cluster's resources first (a hedged etcd snapshot sketch follows this list)
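
Beyond resource manifests, it is also worth snapshotting etcd itself before an upgrade. This is only a minimal sketch, assuming a kubeadm stacked etcd listening on localhost and the default kubeadm certificate paths (the healthcheck-client certificate pair also shows up in the certificate list later in this post); etcdctl must be installed on the master, and the snapshot path is an arbitrary choice:

$ ETCDCTL_API=3 etcdctl snapshot save /data/etcd-snapshot-$(date +%F).db \
    --endpoints=https://127.0.0.1:2379 \
    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
    --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
    --key=/etc/kubernetes/pki/etcd/healthcheck-client.key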


Upgrade Goal


Upgrade Kubernetes from v1.17.9 to v1.18.9.


The current cluster version and nodes are as follows:


# kubectl get nodes 
NAME            STATUS                     ROLES    AGE    VERSION
ecs-968f-0005   Ready                      node     102d   v1.17.9
k8s-master      Ready                      master   102d   v1.17.9


Back Up the Cluster


kubeadm upgrade does not touch your workloads, only Kubernetes' internal components, but a backup is still a good idea. Here I back up all of the cluster's resources using an open-source script; the project lives at https://github.com/solomonxu/k8s-backup-restore. A conceptual sketch of what such a backup boils down to follows.
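
At its core, a full-cluster resource backup like this amounts to iterating over namespaces and resource kinds and dumping every object to its own YAML file, named the way the restore example below expects (namespace_kind_name.yaml). This is a hypothetical minimal equivalent for illustration only; the real script covers far more resource kinds and adds logging:

# Illustration only: the backup path and the resource-kind list are assumptions.
BACKUP_DIR=/data/k8s-backup-restore/data/backup
mkdir -p "$BACKUP_DIR"
for ns in $(kubectl get ns -o jsonpath='{.items[*].metadata.name}'); do
  for kind in deployments services configmaps secrets; do
    for name in $(kubectl -n "$ns" get "$kind" -o jsonpath='{.items[*].metadata.name}'); do
      kubectl -n "$ns" get "$kind" "$name" -o yaml > "$BACKUP_DIR/${ns}_${kind}_${name}.yaml"
    done
  done
done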


(1) Download the script


$ mkdir -p /data
$ cd /data
$ git clone https://github.com/solomonxu/k8s-backup-restore.git


(2) Run the backup


$ cd /data/k8s-backup-restore
$ ./bin/k8s_backup.sh


What if you need to restore? Just run through the following steps.


(1) Create the restore directory


$ mkdir -p /data/k8s-backup-restore/data/restore


(2) Copy the YAML manifests you want to restore into that directory


$ cp devops_deployments_gitlab.yaml ../../restore/


(3) Run the restore script


$ cd /data/k8s-backup-restore
$ ./bin/k8s_restore.sh


It prints output like the following.


2021-01-06 15:09:43.954083 [11623] - INFO Kubernetes Restore start now. All yaml files which located in path [/data/k8s-backup-restore/data/restore] will be applied.
2021-01-06 15:09:43.957265 [11623] - INFO If you want to read the log record of restore, please input command ' tail -100f '
2021-01-06 15:09:43.986869 [11623] - WARN WARNING!!! This will create 1 resources from yaml files into kubernetes cluster. While same name of resources will be deleted. Please consider it carefully!
Do you want to continue? [yes/no/show] y
2021-01-06 15:10:00.062598 [11623] - INFO Restore No.1 resources from yaml file: /data/k8s-backup-restore/data/restore/devops_deployments_gitlab.yaml...
2021-01-06 15:10:00.066011 [11623] - INFO Run shell: kubectl delete -f /data/k8s-backup-restore/data/restore/devops_deployments_gitlab.yaml.
deployment.apps "gitlab" deleted
2021-01-06 15:10:00.423109 [11623] - INFO Delete resource from /data/k8s-backup-restore/data/restore/devops_deployments_gitlab.yaml: ok.
2021-01-06 15:10:00.426383 [11623] - INFO Run shell: kubectl create -f /data/k8s-backup-restore/data/restore/devops_deployments_gitlab.yaml.
deployment.apps/gitlab created
2021-01-06 15:10:00.614960 [11623] - INFO Create resource from /data/k8s-backup-restore/data/restore/devops_deployments_gitlab.yaml: ok.
2021-01-06 15:10:00.618572 [11623] - INFO Restore 1 resources from yaml files in all: count_delete_ok=1, count_delete_failed=0, count_create_ok=1, count_create_failed=0.
2021-01-06 15:10:00.622002 [11623] - INFO Kubernetes Restore completed, all done.


(4) Verify that the resource was restored


$ kubectl get po -n devops
NAME                              READY   STATUS    RESTARTS   AGE
gitlab-65896f7557-786hj           1/1     Running   0          66s


Upgrade the Cluster


Upgrading the Master


(1) Pick the target version


$ yum list --showduplicates kubeadm --disableexcludes=kubernetes


I chose version 1.18.9.


(2) Upgrade kubeadm


$ yum install -y kubeadm-1.18.9-0 --disableexcludes=kubernetes


After the install completes, verify the version.


$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.9", GitCommit:"94f372e501c973a7fa9eb40ec9ebd2fe7ca69848", GitTreeState:"clean", BuildDate:"2020-09-16T13:54:01Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}


(3) Drain the node


$ kubectl cordon k8s-master
$ kubectl drain k8s-master
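
On most real clusters, drain refuses to evict DaemonSet-managed pods and pods using emptyDir volumes. If the plain drain above gets stuck, a variant with the extra flags as they were named in kubectl 1.17/1.18 (--delete-local-data was renamed --delete-emptydir-data in later releases):

$ kubectl drain k8s-master --ignore-daemonsets --delete-local-data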


(4) Run the upgrade plan to see what you can upgrade to


$ kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.17.9
[upgrade/versions] kubeadm version: v1.18.9
I0106 14:22:58.709642   10455 version.go:252] remote version is much newer: v1.20.1; falling back to: stable-1.18
[upgrade/versions] Latest stable version: v1.18.14
[upgrade/versions] Latest stable version: v1.18.14
[upgrade/versions] Latest version in the v1.17 series: v1.17.16
[upgrade/versions] Latest version in the v1.17 series: v1.17.16
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT       AVAILABLE
Kubelet     2 x v1.17.9   v1.17.16
Upgrade to the latest version in the v1.17 series:
COMPONENT            CURRENT   AVAILABLE
API Server           v1.17.9   v1.17.16
Controller Manager   v1.17.9   v1.17.16
Scheduler            v1.17.9   v1.17.16
Kube Proxy           v1.17.9   v1.17.16
CoreDNS              1.6.5     1.6.7
Etcd                 3.4.3     3.4.3-0
You can now apply the upgrade by executing the following command:
    kubeadm upgrade apply v1.17.16
_____________________________________________________________________
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT       AVAILABLE
Kubelet     2 x v1.17.9   v1.18.14
Upgrade to the latest stable version:
COMPONENT            CURRENT   AVAILABLE
API Server           v1.17.9   v1.18.14
Controller Manager   v1.17.9   v1.18.14
Scheduler            v1.17.9   v1.18.14
Kube Proxy           v1.17.9   v1.18.14
CoreDNS              1.6.5     1.6.7
Etcd                 3.4.3     3.4.3-0
You can now apply the upgrade by executing the following command:
    kubeadm upgrade apply v1.18.14
Note: Before you can perform this upgrade, you have to update kubeadm to v1.18.14.
_____________________________________________________________________


The plan shows that newer versions are available, but I am still going to 1.18.9.
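
The apply command below passes --config kubeadm.yaml, but that file is not shown here, so this is only a minimal sketch of what it might contain for this upgrade. kubeadm.k8s.io/v1beta2 is the config version for the 1.18 series; the imageRepository value is an assumption and may be a mirror in your environment. You can also dump the live configuration with kubectl -n kube-system get cm kubeadm-config -o yaml:

apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: v1.18.9
imageRepository: k8s.gcr.io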


(5) Upgrade the cluster


$ kubeadm upgrade apply v1.18.9 --config kubeadm.yaml 
W0106 14:23:58.359112   11936 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[upgrade/config] Making sure the configuration is correct:
W0106 14:23:58.367062   11936 common.go:94] WARNING: Usage of the --config flag for reconfiguring the cluster during upgrade is not recommended!
W0106 14:23:58.367816   11936 configset.go:202] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.18.9"
[upgrade/versions] Cluster version: v1.17.9
[upgrade/versions] kubeadm version: v1.18.9
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Will prepull images for components [kube-apiserver kube-controller-manager kube-scheduler etcd]
[upgrade/prepull] Prepulling image for component etcd.
[upgrade/prepull] Prepulling image for component kube-controller-manager.
[upgrade/prepull] Prepulling image for component kube-scheduler.
[upgrade/prepull] Prepulling image for component kube-apiserver.
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-controller-manager
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-apiserver
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-kube-scheduler
[apiclient] Found 0 Pods for label selector k8s-app=upgrade-prepull-etcd
[apiclient] Found 1 Pods for label selector k8s-app=upgrade-prepull-etcd
[upgrade/prepull] Prepulled image for component etcd.
[upgrade/prepull] Prepulled image for component kube-controller-manager.
[upgrade/prepull] Prepulled image for component kube-apiserver.
[upgrade/prepull] Prepulled image for component kube-scheduler.
[upgrade/prepull] Successfully prepulled the images for all the control plane components
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.18.9"...
Static pod: kube-apiserver-k8s-master hash: d002f0455950f5b76f6097191f93db28
Static pod: kube-controller-manager-k8s-master hash: 54e96591b22cec4a1f5b76965fa90be7
Static pod: kube-scheduler-k8s-master hash: da215ebee0354225c20c7bdf28b467f8
[upgrade/etcd] Upgrading to TLS for etcd
[upgrade/etcd] Non fatal issue encountered during upgrade: the desired etcd version for this Kubernetes version "v1.18.9" is "3.4.3-0", but the current etcd version is "3.4.3". Won't downgrade etcd, instead just continue
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests328986264"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Renewing apiserver-etcd-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2021-01-06-14-24-18/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-apiserver-k8s-master hash: d002f0455950f5b76f6097191f93db28
Static pod: kube-apiserver-k8s-master hash: d002f0455950f5b76f6097191f93db28
Static pod: kube-apiserver-k8s-master hash: 6bc4f16364bf23910ec81c9e91593d95
[apiclient] Found 1 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2021-01-06-14-24-18/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-controller-manager-k8s-master hash: 54e96591b22cec4a1f5b76965fa90be7
Static pod: kube-controller-manager-k8s-master hash: a96ac50aab8a064c2101f684d34ee058
[apiclient] Found 1 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2021-01-06-14-24-18/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-scheduler-k8s-master hash: da215ebee0354225c20c7bdf28b467f8
Static pod: kube-scheduler-k8s-master hash: 1a0670b7d3bff3fd96dbd08f176c1461
[apiclient] Found 1 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.18" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.18.9". Enjoy!
[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.


The output shows the upgrade succeeded.
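
Before uncordoning, you can double-check that the control plane pods are really running the new image. The component label comes straight from the upgrade log above; the repository prefix in the output depends on your configured imageRepository, so it should print something like the following:

$ kubectl -n kube-system get pod -l component=kube-apiserver -o jsonpath='{.items[0].spec.containers[0].image}'
k8s.gcr.io/kube-apiserver:v1.18.9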


(6) Uncordon the master


# kubectl uncordon k8s-master


(7) Upgrade the node configuration (still on the master)


$ kubeadm upgrade node
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade] Upgrading your Static Pod-hosted control plane instance to version "v1.18.9"...
Static pod: kube-apiserver-k8s-master hash: 6bc4f16364bf23910ec81c9e91593d95
Static pod: kube-controller-manager-k8s-master hash: a96ac50aab8a064c2101f684d34ee058
Static pod: kube-scheduler-k8s-master hash: 1a0670b7d3bff3fd96dbd08f176c1461
[upgrade/etcd] Upgrading to TLS for etcd
[upgrade/etcd] Non fatal issue encountered during upgrade: the desired etcd version for this Kubernetes version "v1.18.9" is "3.4.3-0", but the current etcd version is "3.4.3". Won't downgrade etcd, instead just continue
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests315032619"
W0106 14:36:33.013476   30507 manifests.go:225] the default kube-apiserver authorization-mode is "Node,RBAC"; using "Node,RBAC"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Current and new manifests of kube-apiserver are equal, skipping upgrade
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Current and new manifests of kube-controller-manager are equal, skipping upgrade
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Current and new manifests of kube-scheduler are equal, skipping upgrade
[upgrade] The control plane instance for this node was successfully updated!
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[upgrade] The configuration for this node was successfully updated!
[upgrade] Now you should go ahead and upgrade the kubelet package using your package manager.


(8) Upgrade kubelet and kubectl


$ yum install -y kubelet-1.18.9-0 kubectl-1.18.9-0 --disableexcludes=kubernetes


Restart kubelet:


$ systemctl daemon-reload
$ systemctl restart kubelet
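
Once kubelet is back up, the master should report the new version (the full cluster-wide check is repeated at the end of this post):

$ kubectl get node k8s-master
NAME         STATUS   ROLES    AGE    VERSION
k8s-master   Ready    master   102d   v1.18.9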


Upgrading Worker Nodes


(1) Upgrade kubeadm


$ yum install -y kubeadm-1.18.9-0 --disableexcludes=kubernetes


(2) Cordon and drain the node


$ kubectl cordon ecs-968f-0005
$ kubectl drain ecs-968f-0005


(3) Upgrade the node configuration (run on the worker node itself)


$ kubeadm upgrade node
[upgrade] Reading configuration from the cluster...
[upgrade] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[upgrade] Skipping phase. Not a control plane node.
[kubelet-start] Downloading configuration for the kubelet from the "kubelet-config-1.18" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[upgrade] The configuration for this node was successfully updated!
[upgrade] Now you should go ahead and upgrade the kubelet package using your package manager.


(4) Upgrade kubelet


$ yum install -y kubelet-1.18.9-0 --disableexcludes=kubernetes


Restart kubelet:


$ systemctl daemon-reload
$ systemctl restart kubelet


(5) Make the node schedulable again


$ kubectl uncordon ecs-968f-0005


Verify the Cluster


(1) Verify that the cluster state is healthy


$ kubectl get no
NAME            STATUS   ROLES    AGE    VERSION
ecs-968f-0005   Ready    node     102d   v1.18.9
k8s-master      Ready    master   102d   v1.18.9


(2) Check the cluster certificates


$ kubeadm alpha certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Jan 06, 2022 06:36 UTC   364d                                    no      
apiserver                  Jan 06, 2022 06:24 UTC   364d            ca                      no      
apiserver-etcd-client      Jan 06, 2022 06:24 UTC   364d            etcd-ca                 no      
apiserver-kubelet-client   Jan 06, 2022 06:24 UTC   364d            ca                      no      
controller-manager.conf    Jan 06, 2022 06:24 UTC   364d                                    no      
etcd-healthcheck-client    Sep 25, 2021 06:55 UTC   262d            etcd-ca                 no      
etcd-peer                  Sep 25, 2021 06:55 UTC   262d            etcd-ca                 no      
etcd-server                Sep 25, 2021 06:55 UTC   262d            etcd-ca                 no      
front-proxy-client         Jan 06, 2022 06:24 UTC   364d            front-proxy-ca          no      
scheduler.conf             Jan 06, 2022 06:24 UTC   364d                                    no      
CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Sep 23, 2030 06:55 UTC   9y              no      
etcd-ca                 Sep 23, 2030 06:55 UTC   9y              no      
front-proxy-ca          Sep 23, 2030 06:55 UTC   9y              no


Note: kubeadm upgrade also automatically renews the certificates it manages on this node. To opt out of certificate renewal, pass --certificate-renewal=false.
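
If you skipped the automatic renewal, or want to renew certificates on your own schedule, kubeadm 1.18 can renew them all in one go, using the same alpha subcommand family as the check-expiration call above:

$ kubeadm alpha certs renew all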


Recovering from Failure


If the upgrade fails partway through and does not roll back, you can simply run kubeadm upgrade again. To recover from a bad state, you can also run kubeadm upgrade apply with the --force flag.
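
Concretely, for this post's target version, a retry from a bad state would look like the following (use with care: it re-applies the whole upgrade):

$ kubeadm upgrade apply v1.18.9 --force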


During the upgrade, kubeadm writes backup files under /etc/kubernetes/tmp:


  • kubeadm-backup-etcd-
  • kubeadm-backup-manifests-


kubeadm-backup-etcd contains a backup of the local etcd data. If the upgrade fails and cannot be repaired, you can copy this data into the etcd data directory and recover by hand.


kubeadm-backup-manifests holds the node's static Pod YAML manifests. If the upgrade fails and cannot be repaired, you can copy them back into /etc/kubernetes/manifests and recover by hand, as sketched below.
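
A sketch of that manual rollback, reusing the backup directory name from the upgrade log above (your timestamped directory will differ). The kubelet watches /etc/kubernetes/manifests and recreates the static pods as soon as the files change:

$ cp /etc/kubernetes/tmp/kubeadm-backup-manifests-2021-01-06-14-24-18/*.yaml /etc/kubernetes/manifests/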
