Worker node stuck in NotReady after joining the cluster with kubeadm

Summary: a worker node joined to the cluster with kubeadm stays in the NotReady state; this post walks through the troubleshooting steps and the fix.

I. The problem

After running kubeadm join on a worker node, the node keeps showing NotReady in the cluster, as seen below for k8s-node-4:

$ kubectl get nodes
NAME                     STATUS     ROLES    AGE    VERSION
k8s-jmeter-1.novalocal   Ready      <none>   17d    v1.18.5
k8s-jmeter-2.novalocal   Ready      <none>   17d    v1.18.5
k8s-jmeter-3.novalocal   Ready      <none>   17d    v1.18.5
k8s-master.novalocal     Ready      master   51d    v1.18.5
k8s-node-1.novalocal     Ready      <none>   51d    v1.18.5
k8s-node-2.novalocal     Ready      <none>   51d    v1.18.5
k8s-node-3.novalocal     Ready      <none>   51d    v1.18.5
k8s-node-4.novalocal     NotReady   <none>   160m   v1.18.5
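
A NotReady node's Conditions and recent Events usually state the cause directly, so a quick first check is to describe the node. A minimal sketch, using the node name from the listing above; on a freshly joined node the Ready condition typically reports that the CNI plugin is not yet initialized:

# Show the node's conditions (Ready is the last entry)
$ kubectl describe node k8s-node-4.novalocal | grep -A8 Conditions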

II. Troubleshooting

First, check the status of the system Pods:

$ kubectl get pod -n kube-system -o wide
NAME                                           READY   STATUS                  RESTARTS   AGE     IP               NODE                     NOMINATED NODE   READINESS GATES
calico-kube-controllers-5b8b769fcd-srkrb       1/1     Running                 0          3d19h   10.100.185.9     k8s-jmeter-2.novalocal   <none>           <none>
calico-node-5c8xj                              1/1     Running                 10         51d     172.16.106.227   k8s-node-1.novalocal     <none>           <none>
calico-node-9d7rt                              1/1     Running                 8          51d     172.16.106.203   k8s-node-3.novalocal     <none>           <none>
calico-node-crczj                              1/1     Running                 5          51d     172.16.106.226   k8s-node-2.novalocal     <none>           <none>
calico-node-g4hx4                              0/1     Init:ImagePullBackOff   0          99s     172.16.106.219   k8s-node-4.novalocal     <none>           <none>
calico-node-gpmsv                              1/1     Running                 5          17d     172.16.106.209   k8s-jmeter-1.novalocal   <none>           <none>
calico-node-pz7w5                              1/1     Running                 4          51d     172.16.106.200   k8s-master.novalocal     <none>           <none>
calico-node-r59bw                              1/1     Running                 3          17d     172.16.106.216   k8s-jmeter-2.novalocal   <none>           <none>
calico-node-xhjj8                              1/1     Running                 4          17d     172.16.106.210   k8s-jmeter-3.novalocal   <none>           <none>
coredns-66db54ff7f-2cxcp                       1/1     Running                 0          5d22h   10.100.167.140   k8s-node-1.novalocal     <none>           <none>
coredns-66db54ff7f-gptgt                       1/1     Running                 0          5d22h   10.100.41.31     k8s-master.novalocal     <none>           <none>
eip-nfs-nfs-storage-6fddcc8f9d-hqv7m           1/1     Running                 0          3d19h   10.100.185.4     k8s-jmeter-2.novalocal   <none>           <none>
etcd-k8s-master.novalocal                      1/1     Running                 0          5d21h   172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-apiserver-k8s-master.novalocal            1/1     Running                 14         51d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-controller-manager-k8s-master.novalocal   1/1     Running                 56         16d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-proxy-5msrp                               1/1     Running                 1          9d      172.16.106.226   k8s-node-2.novalocal     <none>           <none>
kube-proxy-64pkw                               1/1     Running                 2          9d      172.16.106.210   k8s-jmeter-3.novalocal   <none>           <none>
kube-proxy-6j2fw                               1/1     Running                 1          9d      172.16.106.203   k8s-node-3.novalocal     <none>           <none>
kube-proxy-7cptn                               1/1     Running                 0          157m    172.16.106.219   k8s-node-4.novalocal     <none>           <none>
kube-proxy-fkt9p                               1/1     Running                 1          9d      172.16.106.227   k8s-node-1.novalocal     <none>           <none>
kube-proxy-fxvjb                               1/1     Running                 4          9d      172.16.106.209   k8s-jmeter-1.novalocal   <none>           <none>
kube-proxy-wnj2l                               1/1     Running                 2          9d      172.16.106.216   k8s-jmeter-2.novalocal   <none>           <none>
kube-proxy-wnzqg                               1/1     Running                 0          9d      172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-scheduler-k8s-master.novalocal            1/1     Running                 48         16d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kuboard-5cc4bcccd7-t8h8f                       1/1     Running                 0          21h     10.100.185.24    k8s-jmeter-2.novalocal   <none>           <none>
metrics-server-677dcb8b4d-jtpgd                1/1     Running                 0          3d20h   172.16.106.227   k8s-node-1.novalocal     <none>           <none>

The output shows that the calico Pod on k8s-node-4 failed to initialize: its status is Init:ImagePullBackOff, i.e. an init container's image could not be pulled.
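
The reason for the pull failure can be confirmed from the Pod's events (a sketch, using the failing Pod's name from the listing above); the Events section typically shows which image could not be pulled and why:

# Show the most recent events for the failing calico Pod
$ kubectl describe pod calico-node-g4hx4 -n kube-system | grep -A10 Events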

III. Resolution

1. Identify the container images

Get the container images used by the Pod (every calico-node Pod belongs to the same DaemonSet, so querying any of them returns the same images):

$ kubectl get pods calico-node-7vrgx -n kube-system -o yaml | grep image:
            f:image: {}
            f:image: {}
            f:image: {}
            f:image: {}
    image: calico/node:v3.13.1
    image: calico/cni:v3.13.1
    image: calico/cni:v3.13.1
  - image: calico/pod2daemon-flexvol:v3.13.1
  - image: calico/node:v3.13.1
  - image: calico/cni:v3.13.1
  - image: calico/cni:v3.13.1
  - image: calico/pod2daemon-flexvol:v3.13.1

This calico Pod uses the following three images:

  • calico/node:v3.13.1
  • calico/cni:v3.13.1
  • calico/pod2daemon-flexvol:v3.13.1
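
Instead of grepping the YAML (which also matches managedFields noise), the images can be extracted directly with a JSONPath query; a sketch, assuming the failing Pod's name:

# Init container images on the first line, regular container images on the second
$ kubectl get pod calico-node-g4hx4 -n kube-system \
    -o jsonpath='{.spec.initContainers[*].image}{"\n"}{.spec.containers[*].image}{"\n"}'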

2. Pull the images

Log in to the k8s-node-4 host and run:

$ docker pull calico/node:v3.13.1
$ docker pull calico/cni:v3.13.1  
$ docker pull calico/pod2daemon-flexvol:v3.13.1
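
Once the pulls succeed, it is worth confirming the images are actually present on the node before moving on:

# List the calico images now available on k8s-node-4
$ docker images | grep calico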

3. Load the images offline

If docker pull cannot reach the registry from this node, export the calico images from another node that already has them:

# Save the image to a local tar archive
$ docker save image_id -o xxxx.tar

# Copy the archive to the worker node
$ scp xxxx.tar root@k8s-node-4:/root/

# Load the image
$ docker load -i xxxx.tar

# Re-tag the image (saving by image ID loses the repository:tag, so it must be restored)
$ docker tag image_id tag
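
As a concrete sketch for this case (archive name and source node are illustrative): saving the images by repository:tag keeps the tags, so the docker tag step above is only needed when saving by image ID.

# On a healthy node, e.g. k8s-node-1: export all three calico images into one archive
$ docker save -o calico-v3.13.1.tar \
    calico/node:v3.13.1 calico/cni:v3.13.1 calico/pod2daemon-flexvol:v3.13.1

# Copy the archive to the new worker node
$ scp calico-v3.13.1.tar root@k8s-node-4:/root/

# On k8s-node-4: load the images
$ docker load -i /root/calico-v3.13.1.tar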

4. Recreate the Pod

On the master, delete the failing Pod:

$ kubectl delete pod calico-node-g4hx4 -n kube-system
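
The calico-node Pods are managed by a DaemonSet, so the controller creates a replacement automatically; you can watch it come up (a sketch, assuming the standard calico manifest label):

# Watch the calico-node Pods until the one on k8s-node-4 becomes Running and Ready
$ kubectl get pod -n kube-system -l k8s-app=calico-node -o wide -w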

Wait a moment and check the Pod status again:

$ kubectl get pod -n kube-system -o wide
NAME                                           READY   STATUS    RESTARTS   AGE     IP               NODE                     NOMINATED NODE   READINESS GATES
calico-kube-controllers-5b8b769fcd-srkrb       1/1     Running   0          3d19h   10.100.185.9     k8s-jmeter-2.novalocal   <none>           <none>
calico-node-5c7hn                              0/1     Running   0          8s      172.16.106.219   k8s-node-4.novalocal     <none>           <none>
calico-node-5c8xj                              1/1     Running   10         51d     172.16.106.227   k8s-node-1.novalocal     <none>           <none>
calico-node-9d7rt                              1/1     Running   8          51d     172.16.106.203   k8s-node-3.novalocal     <none>           <none>
calico-node-crczj                              1/1     Running   5          51d     172.16.106.226   k8s-node-2.novalocal     <none>           <none>
calico-node-gpmsv                              1/1     Running   5          17d     172.16.106.209   k8s-jmeter-1.novalocal   <none>           <none>
calico-node-pz7w5                              1/1     Running   4          51d     172.16.106.200   k8s-master.novalocal     <none>           <none>
calico-node-r59bw                              1/1     Running   3          17d     172.16.106.216   k8s-jmeter-2.novalocal   <none>           <none>
calico-node-xhjj8                              1/1     Running   4          17d     172.16.106.210   k8s-jmeter-3.novalocal   <none>           <none>
coredns-66db54ff7f-2cxcp                       1/1     Running   0          5d22h   10.100.167.140   k8s-node-1.novalocal     <none>           <none>
coredns-66db54ff7f-gptgt                       1/1     Running   0          5d22h   10.100.41.31     k8s-master.novalocal     <none>           <none>
eip-nfs-nfs-storage-6fddcc8f9d-hqv7m           1/1     Running   0          3d19h   10.100.185.4     k8s-jmeter-2.novalocal   <none>           <none>
etcd-k8s-master.novalocal                      1/1     Running   0          5d21h   172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-apiserver-k8s-master.novalocal            1/1     Running   14         51d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-controller-manager-k8s-master.novalocal   1/1     Running   56         16d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-proxy-5msrp                               1/1     Running   1          9d      172.16.106.226   k8s-node-2.novalocal     <none>           <none>
kube-proxy-64pkw                               1/1     Running   2          9d      172.16.106.210   k8s-jmeter-3.novalocal   <none>           <none>
kube-proxy-6j2fw                               1/1     Running   1          9d      172.16.106.203   k8s-node-3.novalocal     <none>           <none>
kube-proxy-7cptn                               1/1     Running   0          160m    172.16.106.219   k8s-node-4.novalocal     <none>           <none>
kube-proxy-fkt9p                               1/1     Running   1          9d      172.16.106.227   k8s-node-1.novalocal     <none>           <none>
kube-proxy-fxvjb                               1/1     Running   4          9d      172.16.106.209   k8s-jmeter-1.novalocal   <none>           <none>
kube-proxy-wnj2l                               1/1     Running   2          9d      172.16.106.216   k8s-jmeter-2.novalocal   <none>           <none>
kube-proxy-wnzqg                               1/1     Running   0          9d      172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-scheduler-k8s-master.novalocal            1/1     Running   48         16d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kuboard-5cc4bcccd7-t8h8f                       1/1     Running   0          21h     10.100.185.24    k8s-jmeter-2.novalocal   <none>           <none>
metrics-server-677dcb8b4d-jtpgd                1/1     Running   0          3d20h   172.16.106.227   k8s-node-1.novalocal     <none>           <none>

All Pods are back to normal (the new calico-node Pod on k8s-node-4 is still starting and will become Ready shortly). Now check the node status again:

$ kubectl get nodes
NAME                     STATUS   ROLES    AGE    VERSION
k8s-jmeter-1.novalocal   Ready    <none>   17d    v1.18.5
k8s-jmeter-2.novalocal   Ready    <none>   17d    v1.18.5
k8s-jmeter-3.novalocal   Ready    <none>   17d    v1.18.5
k8s-master.novalocal     Ready    master   51d    v1.18.5
k8s-node-1.novalocal     Ready    <none>   51d    v1.18.5
k8s-node-2.novalocal     Ready    <none>   51d    v1.18.5
k8s-node-3.novalocal     Ready    <none>   51d    v1.18.5
k8s-node-4.novalocal     Ready    <none>   161m   v1.18.5

With that, the newly joined node is back to Ready.
