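Image update has no effect: a Kubernetes component is unhealthy
① Symptom: after a script updates the pod's service image, the pod does not change and its status is not refreshed.
Update the image (no change follows):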
kubectl set image deployment/em-api em-api=192.168.90.10/zhufuc/em-api:v1.0-20201110100058
Pod status:
em-api-86855df489-hmvnr 1/1 Running 0 16m 172.18.94.8 k8s-n5 <none> <none>
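One way to tell a stalled controller apart from a bad image tag is to compare the Deployment's metadata.generation with its status.observedGeneration. The first increases whenever the spec changes; the second catches up only after the Deployment controller (part of kube-controller-manager) has processed the change. The following is a minimal sketch: the function name deployment_observed is made up for illustration, and it assumes kubectl already points at the right cluster and namespace.

```shell
# Hypothetical helper (not from the original write-up).
# Prints "observed" if the Deployment controller has processed the latest
# spec, or "stalled: ..." (exit status 1) if it has not.
deployment_observed() {
  # $1 = Deployment name, e.g. em-api
  gens=$(kubectl get deployment "$1" \
    -o jsonpath='{.metadata.generation} {.status.observedGeneration}') || return 2
  # Split "generation observedGeneration" into $1 and $2.
  set -- $gens
  if [ "${1:-0}" -gt "${2:-0}" ]; then
    echo "stalled: spec generation ${1:-0}, observed ${2:-0}"
    return 1
  fi
  echo "observed"
}

# Usage: deployment_observed em-api
```

If this reports "stalled" for more than a few seconds, the image tag is probably not the problem and the control plane is worth checking.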
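Troubleshooting approach: check the script, check the private registry, check the steps, read the logs, then check the Kubernetes components.
Cause: a Kubernetes component is unhealthy; controller-manager is failing.
The script and the private registry both checked out fine, so look at the kubelet log: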
journalctl -f -u kubelet
Nov 10 10:11:22 k8s-m1 kubelet[32270]: E1110 10:11:22.165336 32270 kuberuntime_sandbox.go:65] CreatePodSandbox for pod "traefik-ingress-controller-jjss4_kube-system(63eee933-933a-11e9-928a-fefcfe274f71)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "traefik-ingress-controller-jjss4": Error response from daemon: driver failed programming external connectivity on endpoint k8s_POD_traefik-ingress-controller-jjss4_kube-system_63eee933-933a-11e9-928a-fefcfe274f71_17529857 (ec7dbca09838629f1e4825175f4be3819723cb1984c9b7d00c2ed499b834fa5a): (iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 8080 -j DNAT --to-destination 172.18.88.14:8080 ! -i docker0: iptables: No chain/target/match by that name.
Nov 10 10:11:22 k8s-m1 kubelet[32270]: (exit status 1))
Nov 10 10:11:22 k8s-m1 kubelet[32270]: E1110 10:11:22.165351 32270 kuberuntime_manager.go:662] createPodSandbox for pod "traefik-ingress-controller-jjss4_kube-system(63eee933-933a-11e9-928a-fefcfe274f71)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "traefik-ingress-controller-jjss4": Error response from daemon: driver failed programming external connectivity on endpoint k8s_POD_traefik-ingress-controller-jjss4_kube-system_63eee933-933a-11e9-928a-fefcfe274f71_17529857 (ec7dbca09838629f1e4825175f4be3819723cb1984c9b7d00c2ed499b834fa5a): (iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 8080 -j DNAT --to-destination 172.18.88.14:8080 ! -i docker0: iptables: No chain/target/match by that name.
Nov 10 10:11:22 k8s-m1 kubelet[32270]: (exit status 1))
Nov 10 10:11:22 k8s-m1 kubelet[32270]: E1110 10:11:22.165417 32270 pod_workers.go:190] Error syncing pod 63eee933-933a-11e9-928a-fefcfe274f71 ("traefik-ingress-controller-jjss4_kube-system(63eee933-933a-11e9-928a-fefcfe274f71)"), skipping: failed to "CreatePodSandbox" for "traefik-ingress-controller-jjss4_kube-system(63eee933-933a-11e9-928a-fefcfe274f71)" with CreatePodSandboxError: "CreatePodSandbox for pod \"traefik-ingress-controller-jjss4_kube-system(63eee933-933a-11e9-928a-fefcfe274f71)\" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod \"traefik-ingress-controller-jjss4\": Error response from daemon: driver failed programming external connectivity on endpoint k8s_POD_traefik-ingress-controller-jjss4_kube-system_63eee933-933a-11e9-928a-fefcfe274f71_17529857 (ec7dbca09838629f1e4825175f4be3819723cb1984c9b7d00c2ed499b834fa5a): (iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 8080 -j DNAT --to-destination 172.18.88.14:8080 ! -i docker0: iptables: No chain/target/match by that name.\n (exit status 1))"
Nov 10 10:11:22 k8s-m1 kubelet[32270]: I1110 10:11:22.165608 32270 server.go:459] Event(v1.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"traefik-ingress-controller-jjss4", UID:"63eee933-933a-11e9-928a-fefcfe274f71", APIVersion:"v1", ResourceVersion:"13081364", FieldPath:""}): type: 'Warning' reason: 'FailedCreatePodSandBox' Failed create pod sandbox: rpc error: code = Unknown desc = failed to start sandbox container for pod "traefik-ingress-controller-jjss4": Error response from daemon: driver failed programming external connectivity on endpoint k8s_POD_traefik-ingress-controller-jjss4_kube-system_63eee933-933a-11e9-928a-fefcfe274f71_17529857 (ec7dbca09838629f1e4825175f4be3819723cb1984c9b7d00c2ed499b834fa5a): (iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 8080 -j DNAT --to-destination 172.18.88.14:8080 ! -i docker0: iptables: No chain/target/match by that name.
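The kubelet repeats the same failure many times, which makes the log hard to scan. A small filter can reduce it to one line per affected pod. This is only a sketch: sandbox_failures is a hypothetical name, and the pattern matches only the "CreatePodSandbox for pod" message format seen in the log above.

```shell
# Hypothetical helper: reads kubelet log lines on stdin and prints each pod
# whose sandbox creation failed, preceded by the number of matching lines.
sandbox_failures() {
  grep -o 'CreatePodSandbox for pod "[^"]*"' \
    | sed 's/.*for pod "\(.*\)"/\1/' \
    | sort | uniq -c | sort -rn
}

# Usage:
# journalctl -u kubelet --since "10 minutes ago" --no-pager | sandbox_failures
```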
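The log shows a controller pod in trouble: traefik-ingress-controller cannot create its sandbox because the iptables rule for the DOCKER chain fails. That does not by itself explain the stalled image update, so check the status of the Kubernetes components next: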
kubectl get cs
NAME                 STATUS      MESSAGE   ERROR
scheduler            Healthy     ok
controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: net/http: HTTP/1.x transport connection broken: malformed HTTP response "\x15\x03\x01\x00\x02\x02"
etcd-1               Healthy     {"health":"true"}
etcd-0               Healthy     {"health":"true"}
etcd-2               Healthy     {"health":"true"}
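When this check runs from a script, it helps to print only the components that are not Healthy. The sketch below parses the column layout shown above; unhealthy_components is a hypothetical name. Note that the componentstatuses API behind kubectl get cs is deprecated in newer Kubernetes releases (since v1.19), so this applies to clusters like the one in this write-up.

```shell
# Hypothetical helper: reads `kubectl get cs` output on stdin and prints the
# name of every component whose STATUS column is not "Healthy".
unhealthy_components() {
  # NR > 1 skips the header row; $1 is NAME, $2 is STATUS.
  awk 'NR > 1 && $2 != "Healthy" { print $1 }'
}

# Usage: kubectl get cs | unhealthy_components
```

Empty output means every listed component reported Healthy.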
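controller-manager is indeed the unhealthy component. The bytes \x15\x03\x01 in the message look like the start of a TLS alert record, which would suggest that whatever is listening on port 10252 answered the plain-HTTP health check with TLS.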
Check the service status for error messages and act on what you find. In my case a plain restart was enough:
systemctl status kube-controller-manager -l
systemctl restart kube-controller-manager
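After the restart, the component can take a moment to report Healthy again. A polling loop avoids re-running the image update too early. This is a sketch under assumptions: wait_component_healthy and the POLL_INTERVAL variable are hypothetical names, and it relies on the same kubectl get cs output format shown in this write-up.

```shell
# Hypothetical helper: polls `kubectl get cs <name>` until STATUS is Healthy.
# $1 = component name, $2 = number of attempts (default 10).
# Returns 0 once healthy, 1 if all attempts are used up.
wait_component_healthy() {
  name=$1
  tries=${2:-10}
  while [ "$tries" -gt 0 ]; do
    # With --no-headers the line is "<name> <status> <message...>".
    state=$(kubectl get cs "$name" --no-headers 2>/dev/null | awk '{ print $2 }')
    if [ "$state" = "Healthy" ]; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "${POLL_INTERVAL:-3}"
  done
  return 1
}

# Usage:
# systemctl restart kube-controller-manager
# wait_component_healthy controller-manager 20 || echo "still unhealthy"
```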
Verification:
Component status:
[root@k8s-m1 script]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true"}
etcd-1               Healthy   {"health":"true"}
etcd-2               Healthy   {"health":"true"}
Update the image again:
kubectl set image deployment/em-api em-api=192.168.90.10/zhufuc/em-api:v1.0-20201110100058
Pod status (a new pod is now being created):
em-api-7c7f76dcdc-kdr5c 0/1 ContainerCreating 0 0s <none> k8s-n5 <none> <none>
em-api-86855df489-hmvnr 1/1 Running 0 16m 172.18.94.8 k8s-n5 <none> <none>
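To confirm that the rollout actually finishes, one option is to list the pods that are still running a different image than the target. The sketch below is illustrative only: pods_not_on_image is a hypothetical name, and the label selector app=em-api in the usage line is an assumption, since the original does not show the Deployment's labels.

```shell
# Hypothetical helper: reads "pod-name<TAB>image" lines on stdin and prints
# the pods whose image differs from the wanted image given as $1.
pods_not_on_image() {
  awk -F '\t' -v want="$1" '$2 != want { print $1 }'
}

# Usage (the label app=em-api is an assumption; adjust it to your Deployment):
# kubectl get pods -l app=em-api \
#   -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[0].image}{"\n"}{end}' \
#   | pods_not_on_image 192.168.90.10/zhufuc/em-api:v1.0-20201110100058
```

Empty output means every matched pod already runs the target image.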
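Problem solved!
Containers cannot communicate across nodes: network failure
node01 cannot ping the IPs of containers on other nodes, and containers on node01 likewise cannot ping the IPs of containers on other nodes.
Example: pinging from a container on another node
node01 is unreachable
From a container on node05, ping the IP of a container on node02 and the IP of a container on node01.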
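The ping test from inside a container can be scripted with kubectl exec. The sketch below makes several assumptions: check_reach is a hypothetical name, the source pod lives in the current namespace, and its image ships a ping binary that accepts -c and -W. Pod names and IPs in the usage line are placeholders.

```shell
# Hypothetical helper: from pod $1, ping every IP given as a further argument
# and report whether each one answered.
check_reach() {
  src=$1
  shift
  for ip in "$@"; do
    if kubectl exec "$src" -- ping -c 2 -W 2 "$ip" >/dev/null 2>&1; then
      echo "$ip reachable"
    else
      echo "$ip UNREACHABLE"
    fi
  done
}

# Usage (placeholders): check_reach <pod-on-node05> <pod-ip-on-node02> <pod-ip-on-node01>
```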
Check the Kubernetes system components and add-ons:
kubectl get pods --all-namespaces
Some pods show an abnormal status; inspect them for details:
kubectl describe pod -n kube-system [pod-name]
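On a busy cluster the full pod list is long, so a filter that keeps only the suspicious pods can save time. The following sketch treats a pod as unhealthy when its STATUS is neither Running nor Completed, or when it is Running but not all containers are ready. unhealthy_pods is a hypothetical name, and the column order is the one printed by kubectl get pods --all-namespaces.

```shell
# Hypothetical helper: reads `kubectl get pods --all-namespaces --no-headers`
# output on stdin and prints "namespace/name" for each unhealthy pod.
# Columns: $1 NAMESPACE, $2 NAME, $3 READY (x/y), $4 STATUS.
unhealthy_pods() {
  awk '{
    split($3, ready, "/")
    if ($4 == "Completed") next
    if ($4 != "Running" || ready[1] != ready[2]) print $1 "/" $2
  }'
}

# Usage: kubectl get pods --all-namespaces --no-headers | unhealthy_pods
```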
Restart every unhealthy pod by deleting it, so that its controller recreates it:
kubectl delete pod -n kube-system kube-proxy-2mzcp
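When several pods need restarting, the delete commands can be generated from a list instead of typed one by one. The sketch below is deliberately cautious: restart_pods and the DRY_RUN variable are hypothetical names, and by default it only prints the commands. Keep in mind that a deleted pod comes back only if a controller such as a DaemonSet or Deployment manages it.

```shell
# Hypothetical helper: reads "namespace/name" lines on stdin and deletes each
# pod. With DRY_RUN=1 (the default here) it only prints the commands.
restart_pods() {
  while IFS=/ read -r ns pod; do
    # Skip blank or malformed lines.
    [ -n "$pod" ] || continue
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "kubectl delete pod -n $ns $pod"
    else
      kubectl delete pod -n "$ns" "$pod"
    fi
  done
}

# Usage:
# echo "kube-system/kube-proxy-2mzcp" | restart_pods            # print only
# echo "kube-system/kube-proxy-2mzcp" | DRY_RUN=0 restart_pods  # really delete
```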
Check the pod status again
Test
From node01, ping the IPs of containers on other nodes
Success