4. Distributing the etcd cluster certificates
To use an external etcd cluster, the etcd cluster must be installed first, and its certificates must then be distributed to every node. In this example there are four nodes, and all of them need the certificates.
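Before distributing the certificates, it is worth confirming that the external etcd cluster is actually healthy. A minimal sketch, assuming etcdctl (v3 API) is available on master1 and using the certificate paths and endpoints of this deployment:

ETCDCTL_API=3 etcdctl \
  --endpoints=https://192.168.217.19:2379,https://192.168.217.20:2379,https://192.168.217.21:2379 \
  --cacert=/opt/etcd/ssl/ca.pem \
  --cert=/opt/etcd/ssl/server.pem \
  --key=/opt/etcd/ssl/server-key.pem \
  endpoint health

If all three endpoints report healthy, proceed with the distribution commands below.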
Create the target directory on all nodes (run on all four servers):
mkdir -p /etc/kubernetes/pki/etcd/
On the master1 node, copy the etcd certificates into the directory created above:
cp /opt/etcd/ssl/ca.pem /etc/kubernetes/pki/etcd/
cp /opt/etcd/ssl/server.pem /etc/kubernetes/pki/etcd/apiserver-etcd-client.pem
cp /opt/etcd/ssl/server-key.pem /etc/kubernetes/pki/etcd/apiserver-etcd-client-key.pem
Distribute the etcd certificates to the other nodes:
scp /etc/kubernetes/pki/etcd/* master2:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/* master3:/etc/kubernetes/pki/etcd/
scp /etc/kubernetes/pki/etcd/* node1:/etc/kubernetes/pki/etcd/
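The same distribution can also be done in one small loop; a sketch that assumes the hostnames master2, master3 and node1 resolve (for example via /etc/hosts) and that passwordless SSH is already set up:

for host in master2 master3 node1; do
  ssh "$host" "mkdir -p /etc/kubernetes/pki/etcd/"
  scp /etc/kubernetes/pki/etcd/* "$host":/etc/kubernetes/pki/etcd/
done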
5. Initializing the cluster with kubeadm. kubeadm can initialize a cluster in two ways: with command-line flags, or with a config file. This article uses the config file approach:
Generate a template file on the 192.168.217.19 server:
kubeadm config print init-defaults > kubeadm-init-ha.yaml
Edit this template file; the final content should look like this:
[root@master1 ~]# cat kubeadm-init-ha.yaml
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: "0"
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.217.19
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  name: master1
  taints: null
---
controlPlaneEndpoint: "192.168.217.100"
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  external:
    endpoints:    # addresses of the self-managed external etcd cluster
    - https://192.168.217.19:2379
    - https://192.168.217.20:2379
    - https://192.168.217.21:2379
    caFile: /etc/kubernetes/pki/etcd/ca.pem
    certFile: /etc/kubernetes/pki/etcd/apiserver-etcd-client.pem
    keyFile: /etc/kubernetes/pki/etcd/apiserver-etcd-client-key.pem
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.22.2
networking:
  dnsDomain: cluster.local
  podSubnet: "10.244.0.0/16"
  serviceSubnet: "10.96.0.0/12"
scheduler: {}
Notes on the kubeadm-init-ha.yaml configuration file:
name: must match the hostname configured consistently in /etc/hosts. Since the initialization will be run on the master1 node, this is master1's hostname.
controlPlaneEndpoint: the VIP address; the port does not need to be specified.
imageRepository: the Google registry k8s.gcr.io is not reachable from mainland China, so this is set to the Alibaba Cloud mirror registry.aliyuncs.com/google_containers.
podSubnet: this CIDR must match the network plugin deployed later; flannel will be used here, so it is set to 10.244.0.0/16.
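Optionally, the required images can be listed and pulled ahead of time with the same config file; the init log below also suggests this:

kubeadm config images list --config kubeadm-init-ha.yaml
kubeadm config images pull --config kubeadm-init-ha.yaml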
The initialization command and its output log (run on the master1 node):
[root@master1 ~]# kubeadm init --config=kubeadm-init-ha.yaml --upload-certs
[init] Using Kubernetes version: v1.22.2
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master1] and IPs [10.96.0.1 192.168.217.19 192.168.217.100]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] External etcd mode: Skipping etcd/ca certificate authority generation
[certs] External etcd mode: Skipping etcd/server certificate generation
[certs] External etcd mode: Skipping etcd/peer certificate generation
[certs] External etcd mode: Skipping etcd/healthcheck-client certificate generation
[certs] External etcd mode: Skipping apiserver-etcd-client certificate generation
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 11.508717 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key:
d14a06a73a7aabf5cce805880c085f7427247d507ba01bf2d2f21c294aeb8643
[mark-control-plane] Marking the node master1 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e \
    --control-plane --certificate-key d14a06a73a7aabf5cce805880c085f7427247d507ba01bf2d2f21c294aeb8643

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e
As you can see, there are two join commands, and they serve different purposes.
The first join command:
This command joins master nodes, i.e., nodes that run the apiserver. In this example it should therefore be run on the master2 and master3 servers.
kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e \
    --control-plane --certificate-key d14a06a73a7aabf5cce805880c085f7427247d507ba01bf2d2f21c294aeb8643
Run this join command on the master2 node; its output log is as follows:
[root@master2 ~]# kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e --control-plane --certificate-key d14a06a73a7aabf5cce805880c085f7427247d507ba01bf2d2f21c294aeb8643
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks before initializing the new control plane instance
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[download-certs] Downloading the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master2] and IPs [10.96.0.1 192.168.217.20 192.168.217.100]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[certs] Using the existing "sa" key
[kubeconfig] Generating kubeconfig files
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[check-etcd] Skipping etcd check in external mode
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...
[control-plane-join] using external etcd - no local stacked instance added
The 'update-status' phase is deprecated and will be removed in a future release. Currently it performs no operation
[mark-control-plane] Marking the node master2 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node master2 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.

To start administering your cluster from this node, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.
Key output:
At this point master2 has joined the cluster in the control-plane role and has been given the NoSchedule taint.
[control-plane-join] using external etcd - no local stacked instance added
The 'update-status' phase is deprecated and will be removed in a future release. Currently it performs no operation
[mark-control-plane] Marking the node master2 as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node master2 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
Run the same command on the master3 server; the output is essentially identical, so it is not repeated here.
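To double-check the result, the control-plane label and the NoSchedule taint on the newly joined masters can be inspected from any node where kubectl is configured, for example:

kubectl get nodes --show-labels
kubectl describe node master2 | grep -i taints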
The second join command (worker nodes):
kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e
This command is run on the node1 node. This example has only one worker node; to add more worker nodes later, simply run the same command on them.
The output log of this command on node1 is as follows:
[root@node1 ~]# kubeadm join 192.168.217.100:6443 --token abcdef.0123456789abcdef --discovery-token-ca-cert-hash sha256:f9ea1a25922e715f893c360759c9f56a26b1a3760df4ce140e94610a76ca6a9e
[preflight] Running pre-flight checks
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Key output:
This node has joined the cluster; it sent a certificate signing request to the apiserver, the request was accepted, and the response was returned to the node.
This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received
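If a join command is needed again later (for example to add another worker node after the original output has been lost, or another control-plane node after the uploaded certificates have been deleted; the init log above notes they are removed after two hours), the following commands, run on an existing master node, regenerate the required values. A sketch:

# Print a fresh worker join command (new token plus the CA cert hash):
kubeadm token create --print-join-command

# Re-upload the control-plane certificates and print a new certificate key;
# when joining an additional master, append "--control-plane --certificate-key <key>"
# to the join command printed above:
kubeadm init phase upload-certs --upload-certs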
Appendix 1: the environment variable (kubeconfig) setup after the kubeadm deployment completes
The master initialization log above contains the following output:
To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf
It is recommended to run these commands on only one node. For example, if cluster administration will be done only on master1, run the following on master1:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
If a worker node is designated for cluster administration instead, copy /etc/kubernetes/admin.conf from any master node into the same directory on that worker node, which is node1 in this example:
Copy the file from the master1 node:
[root@master1 ~]# scp /etc/kubernetes/admin.conf node1:/etc/kubernetes/
admin.conf
Run the following commands on the node1 node:
[root@node1 ~]# mkdir -p $HOME/.kube
[root@node1 ~]# sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
[root@node1 ~]# sudo chown $(id -u):$(id -g) $HOME/.kube/config
Cluster administration now works from node1:
[root@node1 ~]# kubectl get no,po -A
NAME           STATUS   ROLES                  AGE   VERSION
node/master1   Ready    control-plane,master   22h   v1.22.2
node/master2   Ready    control-plane,master   22h   v1.22.2
node/master3   Ready    control-plane,master   21h   v1.22.2
node/node1     Ready    <none>                 21h   v1.22.2

NAMESPACE     NAME                                  READY   STATUS            RESTARTS       AGE
default       pod/dns-test                          0/1     Completed         0              6h11m
default       pod/nginx-7fb9867-jfg2r               1/1     Running           2              6h6m
kube-system   pod/coredns-7f6cbbb7b8-7c85v          1/1     Running           4              20h
kube-system   pod/coredns-7f6cbbb7b8-h9wtb          1/1     Running           4              20h
kube-system   pod/kube-apiserver-master1            1/1     Running           7              22h
kube-system   pod/kube-apiserver-master2            1/1     Running           6              22h
kube-system   pod/kube-apiserver-master3            0/1     Running           5              21h
kube-system   pod/kube-controller-manager-master1   1/1     Running           0              71m
kube-system   pod/kube-controller-manager-master2   1/1     Running           0              67m
kube-system   pod/kube-controller-manager-master3   0/1     Running           0              3s
kube-system   pod/kube-flannel-ds-7dmgh             1/1     Running           6              21h
kube-system   pod/kube-flannel-ds-b99h7             1/1     Running           19 (70m ago)   21h
kube-system   pod/kube-flannel-ds-hmzj7             1/1     Running           5              21h
kube-system   pod/kube-flannel-ds-vvld8             0/1     PodInitializing   4 (140m ago)   21h
kube-system   pod/kube-proxy-nkgdf                  1/1     Running           5              21h
kube-system   pod/kube-proxy-rb9zk                  1/1     Running           5              21h
kube-system   pod/kube-proxy-rvbb7                  1/1     Running           7              22h
kube-system   pod/kube-proxy-xmrp5                  1/1     Running           6              22h
kube-system   pod/kube-scheduler-master1            1/1     Running           0              71m
kube-system   pod/kube-scheduler-master2            1/1     Running           0              67m
kube-system   pod/kube-scheduler-master3            0/1     Running           0              3s
Appendix 2: certificate lifetimes in a kubeadm deployment
[root@master1 ~]# kubeadm certs check-expiration
[check-expiration] Reading configuration from the cluster...
[check-expiration] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'

CERTIFICATE                EXPIRES                  RESIDUAL TIME   CERTIFICATE AUTHORITY   EXTERNALLY MANAGED
admin.conf                 Oct 27, 2023 11:19 UTC   364d                                    no
apiserver                  Oct 27, 2023 11:19 UTC   364d            ca                      no
apiserver-kubelet-client   Oct 27, 2023 11:19 UTC   364d            ca                      no
controller-manager.conf    Oct 27, 2023 11:19 UTC   364d                                    no
front-proxy-client         Oct 27, 2023 11:19 UTC   364d            front-proxy-ca          no
scheduler.conf             Oct 27, 2023 11:19 UTC   364d                                    no

CERTIFICATE AUTHORITY   EXPIRES                  RESIDUAL TIME   EXTERNALLY MANAGED
ca                      Oct 24, 2032 11:19 UTC   9y              no
front-proxy-ca          Oct 24, 2032 11:19 UTC   9y              no
How to change these certificate lifetimes will be covered in a separate post later.
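As a quick preview of that post (a sketch only, not the full procedure), kubeadm can renew all of the leaf certificates it manages for another year:

# Renew every kubeadm-managed certificate on this master node:
kubeadm certs renew all

# The renewed certificates are only picked up after kube-apiserver,
# kube-controller-manager and kube-scheduler are restarted, for example by
# moving their manifests out of /etc/kubernetes/manifests and back a few
# seconds later so the kubelet recreates the static pods.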
Appendix 3: a bug in kubeadm deployments
A status check of a cluster deployed with kubeadm shows an incorrect (unhealthy) result:
[root@node1 ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager   Healthy     ok
etcd-1               Healthy     {"health":"true"}
etcd-2               Healthy     {"health":"true"}
etcd-0               Healthy     {"health":"true"}
Perform the following once on each of the three master nodes:
[root@master1 ~]# cd /etc/kubernetes/manifests/
[root@master1 manifests]# ll
total 12
-rw------- 1 root root 3452 Oct 27 19:19 kube-apiserver.yaml
-rw------- 1 root root 2893 Oct 27 19:19 kube-controller-manager.yaml
-rw------- 1 root root 1479 Oct 27 19:19 kube-scheduler.yaml
Edit the kube-controller-manager.yaml and kube-scheduler.yaml files and delete the --port=0 line. No service needs to be restarted manually: these are static pod manifests, so the kubelet notices the change and recreates the pods, after which the cluster status shows as healthy (one way to make this edit non-interactively is sketched below):
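A sketch using sed, assuming the flag appears exactly as --port=0 in both manifests (back the files up first if you prefer to edit by hand):

sed -i '/--port=0/d' /etc/kubernetes/manifests/kube-controller-manager.yaml
sed -i '/--port=0/d' /etc/kubernetes/manifests/kube-scheduler.yaml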
Note that kubectl get cs is only a quick way to view the cluster status and see whether the core components are healthy (and, incidentally, which components exist). You can see that this cluster uses a three-node etcd cluster.
[root@node1 ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-2               Healthy   {"health":"true"}
etcd-0               Healthy   {"health":"true"}
etcd-1               Healthy   {"health":"true"}