Troubleshooting a worker node stuck in NotReady after kubeadm join

Summary: a worker node joined to the cluster with kubeadm stays in NotReady; this article walks through diagnosing and fixing the fault.

1. The Problem

After running kubeadm join on a worker node, the cluster keeps reporting the node as NotReady, as with k8s-node-4 below:

$ kubectl get nodes
NAME                     STATUS     ROLES    AGE    VERSION
k8s-jmeter-1.novalocal   Ready      <none>   17d    v1.18.5
k8s-jmeter-2.novalocal   Ready      <none>   17d    v1.18.5
k8s-jmeter-3.novalocal   Ready      <none>   17d    v1.18.5
k8s-master.novalocal     Ready      master   51d    v1.18.5
k8s-node-1.novalocal     Ready      <none>   51d    v1.18.5
k8s-node-2.novalocal     Ready      <none>   51d    v1.18.5
k8s-node-3.novalocal     Ready      <none>   51d    v1.18.5
k8s-node-4.novalocal     NotReady   <none>   160m   v1.18.5

2. Troubleshooting

First, check how the system pods are initializing:

$ kubectl get pod -n kube-system -o wide
NAME                                           READY   STATUS                  RESTARTS   AGE     IP               NODE                     NOMINATED NODE   READINESS GATES
calico-kube-controllers-5b8b769fcd-srkrb       1/1     Running                 0          3d19h   10.100.185.9     k8s-jmeter-2.novalocal   <none>           <none>
calico-node-5c8xj                              1/1     Running                 10         51d     172.16.106.227   k8s-node-1.novalocal     <none>           <none>
calico-node-9d7rt                              1/1     Running                 8          51d     172.16.106.203   k8s-node-3.novalocal     <none>           <none>
calico-node-crczj                              1/1     Running                 5          51d     172.16.106.226   k8s-node-2.novalocal     <none>           <none>
calico-node-g4hx4                              0/1     Init:ImagePullBackOff   0          99s     172.16.106.219   k8s-node-4.novalocal     <none>           <none>
calico-node-gpmsv                              1/1     Running                 5          17d     172.16.106.209   k8s-jmeter-1.novalocal   <none>           <none>
calico-node-pz7w5                              1/1     Running                 4          51d     172.16.106.200   k8s-master.novalocal     <none>           <none>
calico-node-r59bw                              1/1     Running                 3          17d     172.16.106.216   k8s-jmeter-2.novalocal   <none>           <none>
calico-node-xhjj8                              1/1     Running                 4          17d     172.16.106.210   k8s-jmeter-3.novalocal   <none>           <none>
coredns-66db54ff7f-2cxcp                       1/1     Running                 0          5d22h   10.100.167.140   k8s-node-1.novalocal     <none>           <none>
coredns-66db54ff7f-gptgt                       1/1     Running                 0          5d22h   10.100.41.31     k8s-master.novalocal     <none>           <none>
eip-nfs-nfs-storage-6fddcc8f9d-hqv7m           1/1     Running                 0          3d19h   10.100.185.4     k8s-jmeter-2.novalocal   <none>           <none>
etcd-k8s-master.novalocal                      1/1     Running                 0          5d21h   172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-apiserver-k8s-master.novalocal            1/1     Running                 14         51d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-controller-manager-k8s-master.novalocal   1/1     Running                 56         16d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-proxy-5msrp                               1/1     Running                 1          9d      172.16.106.226   k8s-node-2.novalocal     <none>           <none>
kube-proxy-64pkw                               1/1     Running                 2          9d      172.16.106.210   k8s-jmeter-3.novalocal   <none>           <none>
kube-proxy-6j2fw                               1/1     Running                 1          9d      172.16.106.203   k8s-node-3.novalocal     <none>           <none>
kube-proxy-7cptn                               1/1     Running                 0          157m    172.16.106.219   k8s-node-4.novalocal     <none>           <none>
kube-proxy-fkt9p                               1/1     Running                 1          9d      172.16.106.227   k8s-node-1.novalocal     <none>           <none>
kube-proxy-fxvjb                               1/1     Running                 4          9d      172.16.106.209   k8s-jmeter-1.novalocal   <none>           <none>
kube-proxy-wnj2l                               1/1     Running                 2          9d      172.16.106.216   k8s-jmeter-2.novalocal   <none>           <none>
kube-proxy-wnzqg                               1/1     Running                 0          9d      172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-scheduler-k8s-master.novalocal            1/1     Running                 48         16d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kuboard-5cc4bcccd7-t8h8f                       1/1     Running                 0          21h     10.100.185.24    k8s-jmeter-2.novalocal   <none>           <none>
metrics-server-677dcb8b4d-jtpgd                1/1     Running                 0          3d20h   172.16.106.227   k8s-node-1.novalocal     <none>           <none>

The output shows that the calico component on node-4 has not initialized: its pod is stuck in Init:ImagePullBackOff, meaning the node cannot pull one of the calico images.
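
Before moving to a fix, it is worth confirming the root cause from both sides; a minimal check, using the node and pod names from the listings above. The node's conditions normally report the container runtime network as not ready while the CNI plugin is uninitialized, and the pod's Events section shows exactly which image pull is failing:

# Check why the node reports NotReady
$ kubectl describe node k8s-node-4.novalocal

# Check which image pull is failing on the stuck pod
$ kubectl describe pod calico-node-g4hx4 -n kube-system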

3. Resolution

Step 1: Identify the container images

List the images the calico pod uses:

$ kubectl get pods calico-node-g4hx4 -n kube-system -o yaml | grep image:
            f:image: {}
            f:image: {}
            f:image: {}
            f:image: {}
    image: calico/node:v3.13.1
    image: calico/cni:v3.13.1
    image: calico/cni:v3.13.1
  - image: calico/pod2daemon-flexvol:v3.13.1
  - image: calico/node:v3.13.1
  - image: calico/cni:v3.13.1
  - image: calico/cni:v3.13.1
  - image: calico/pod2daemon-flexvol:v3.13.1

This calico pod uses the following three images:

  • calico/node:v3.13.1
  • calico/cni:v3.13.1
  • calico/pod2daemon-flexvol:v3.13.1
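
As a side note, if the managedFields noise in the grep output above is distracting, a jsonpath query (an alternative to the grep, assuming the same pod name) prints only the init and main container images:

$ kubectl get pod calico-node-g4hx4 -n kube-system \
    -o jsonpath='{range .spec.initContainers[*]}{.image}{"\n"}{end}{range .spec.containers[*]}{.image}{"\n"}{end}'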

Step 2: Pull the images

Log in to the node-4 host and run:

$ docker pull calico/node:v3.13.1
$ docker pull calico/cni:v3.13.1  
$ docker pull calico/pod2daemon-flexvol:v3.13.1
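
After the pulls complete, a quick sanity check that all three images are now present locally:

$ docker images | grep calico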

Step 3: Ship the images offline

If docker pull cannot download the images on node-4, export the calico images from a node that already has them:

# Save the image to a local tarball (saving by image_id drops the
# repository:tag metadata, which is why the re-tag step below is needed)
$ docker save image_id -o xxxx.tar

# Copy the tarball to the worker node
$ scp xxxx.tar root@k8s-node-4:/root/

# Load the image on the worker node
$ docker load -i xxxx.tar

# Re-tag the image to the name the pod spec expects, e.g. calico/node:v3.13.1
$ docker tag image_id repository:tag
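
Since docker save also accepts image names, which keeps the repository:tag metadata intact, a more convenient sketch is to bundle all three calico images into one tarball so no re-tagging is needed afterwards (host and path taken from the example above):

# On a healthy node: export all three images with their tags preserved
$ docker save -o calico-v3.13.1.tar calico/node:v3.13.1 calico/cni:v3.13.1 calico/pod2daemon-flexvol:v3.13.1

# Copy to node-4 and load
$ scp calico-v3.13.1.tar root@k8s-node-4:/root/
$ docker load -i /root/calico-v3.13.1.tar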

Step 4: Recreate the pod

On the master, delete the failed pod:

$ kubectl delete pod calico-node-g4hx4 -n kube-system
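
Because calico-node pods are managed by a DaemonSet, the controller recreates the deleted pod automatically, and the new pod can now use the locally loaded images. Assuming the default DaemonSet name calico-node, you can watch the rollout with:

$ kubectl rollout status daemonset/calico-node -n kube-system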

After a short wait, check the pod status again:

$ kubectl get pod -n kube-system -o wide
NAME                                           READY   STATUS    RESTARTS   AGE     IP               NODE                     NOMINATED NODE   READINESS GATES
calico-kube-controllers-5b8b769fcd-srkrb       1/1     Running   0          3d19h   10.100.185.9     k8s-jmeter-2.novalocal   <none>           <none>
calico-node-5c7hn                              0/1     Running   0          8s      172.16.106.219   k8s-node-4.novalocal     <none>           <none>
calico-node-5c8xj                              1/1     Running   10         51d     172.16.106.227   k8s-node-1.novalocal     <none>           <none>
calico-node-9d7rt                              1/1     Running   8          51d     172.16.106.203   k8s-node-3.novalocal     <none>           <none>
calico-node-crczj                              1/1     Running   5          51d     172.16.106.226   k8s-node-2.novalocal     <none>           <none>
calico-node-gpmsv                              1/1     Running   5          17d     172.16.106.209   k8s-jmeter-1.novalocal   <none>           <none>
calico-node-pz7w5                              1/1     Running   4          51d     172.16.106.200   k8s-master.novalocal     <none>           <none>
calico-node-r59bw                              1/1     Running   3          17d     172.16.106.216   k8s-jmeter-2.novalocal   <none>           <none>
calico-node-xhjj8                              1/1     Running   4          17d     172.16.106.210   k8s-jmeter-3.novalocal   <none>           <none>
coredns-66db54ff7f-2cxcp                       1/1     Running   0          5d22h   10.100.167.140   k8s-node-1.novalocal     <none>           <none>
coredns-66db54ff7f-gptgt                       1/1     Running   0          5d22h   10.100.41.31     k8s-master.novalocal     <none>           <none>
eip-nfs-nfs-storage-6fddcc8f9d-hqv7m           1/1     Running   0          3d19h   10.100.185.4     k8s-jmeter-2.novalocal   <none>           <none>
etcd-k8s-master.novalocal                      1/1     Running   0          5d21h   172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-apiserver-k8s-master.novalocal            1/1     Running   14         51d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-controller-manager-k8s-master.novalocal   1/1     Running   56         16d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-proxy-5msrp                               1/1     Running   1          9d      172.16.106.226   k8s-node-2.novalocal     <none>           <none>
kube-proxy-64pkw                               1/1     Running   2          9d      172.16.106.210   k8s-jmeter-3.novalocal   <none>           <none>
kube-proxy-6j2fw                               1/1     Running   1          9d      172.16.106.203   k8s-node-3.novalocal     <none>           <none>
kube-proxy-7cptn                               1/1     Running   0          160m    172.16.106.219   k8s-node-4.novalocal     <none>           <none>
kube-proxy-fkt9p                               1/1     Running   1          9d      172.16.106.227   k8s-node-1.novalocal     <none>           <none>
kube-proxy-fxvjb                               1/1     Running   4          9d      172.16.106.209   k8s-jmeter-1.novalocal   <none>           <none>
kube-proxy-wnj2l                               1/1     Running   2          9d      172.16.106.216   k8s-jmeter-2.novalocal   <none>           <none>
kube-proxy-wnzqg                               1/1     Running   0          9d      172.16.106.200   k8s-master.novalocal     <none>           <none>
kube-scheduler-k8s-master.novalocal            1/1     Running   48         16d     172.16.106.200   k8s-master.novalocal     <none>           <none>
kuboard-5cc4bcccd7-t8h8f                       1/1     Running   0          21h     10.100.185.24    k8s-jmeter-2.novalocal   <none>           <none>
metrics-server-677dcb8b4d-jtpgd                1/1     Running   0          3d20h   172.16.106.227   k8s-node-1.novalocal     <none>           <none>

All pods are now in a normal state; the new calico-node pod on node-4 is Running and, at 8 seconds old, still finishing its readiness checks. Now check the node status again:

$ kubectl get nodes
NAME                     STATUS   ROLES    AGE    VERSION
k8s-jmeter-1.novalocal   Ready    <none>   17d    v1.18.5
k8s-jmeter-2.novalocal   Ready    <none>   17d    v1.18.5
k8s-jmeter-3.novalocal   Ready    <none>   17d    v1.18.5
k8s-master.novalocal     Ready    master   51d    v1.18.5
k8s-node-1.novalocal     Ready    <none>   51d    v1.18.5
k8s-node-2.novalocal     Ready    <none>   51d    v1.18.5
k8s-node-3.novalocal     Ready    <none>   51d    v1.18.5
k8s-node-4.novalocal     Ready    <none>   161m   v1.18.5

With that, the newly joined node is back to Ready.
