Preface:
Deploying a Kubernetes cluster by hand is fairly tedious, but kubeadm makes rapid deployment possible. Doing the deployment offline improves efficiency further, so that the network (images that often fail to download, or download very slowly) is no longer the bottleneck.
Below is a walkthrough of how to use kubeadm to quickly deploy a simple Kubernetes cluster suitable for testing (if you are reasonably fluent with Linux, the whole deployment can be finished in about 5 minutes).
1. Servers used in this walkthrough and the components to be installed
Three VMware virtual machines are planned for this cluster; physical machines work exactly the same way, so that case is not covered separately.
Cluster configuration:
| Server IP | Operating system | Hardware | Kernel version | Role and installed components |
| --- | --- | --- | --- | --- |
| 192.168.217.19 | CentOS Linux release 7.4.1708 (Core) | 2 CPU cores, 4 GB RAM, 100 GB disk | 5.16.9-1.el7.elrepo.x86_64 | master: kubeadm, kubelet, Docker (docker-ce 20.10.5) |
| 192.168.217.20 | CentOS Linux release 7.4.1708 (Core) | 2 CPU cores, 4 GB RAM, 100 GB disk | 5.16.9-1.el7.elrepo.x86_64 | node: kubelet, Docker (docker-ce 20.10.5) |
| 192.168.217.21 | CentOS Linux release 7.4.1708 (Core) | 2 CPU cores, 4 GB RAM, 100 GB disk | 5.16.9-1.el7.elrepo.x86_64 | node: kubelet, Docker (docker-ce 20.10.5) |
2. Prerequisites
A. Time synchronization (NTP)
The purpose of a time server hardly needs explaining; consistent time across the nodes is a must-have.
Run on all three nodes:
yum install ntp -y && systemctl enable ntpd
In /etc/ntp.conf on the 19 server (192.168.217.19):
server 127.127.1.0 prefer
fudge 127.127.1.0 stratum 10
In /etc/ntp.conf on the 20 and 21 servers:
server 192.168.217.19
Run the following on all three servers to restart the NTP service:
systemctl restart ntpd
The configuration above makes the 19 server the primary time source, with the other nodes synchronizing to it. Run the following on a worker node; output like this means the time service is working:
[root@node2 ~]# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*master          LOCAL(0)        11 u   50  256  377    0.517    0.024   0.076
[root@node2 ~]# ntpstat
synchronised to NTP server (192.168.217.19) at stratum 12
   time correct to within 25 ms
   polling server every 256 s
B. Passwordless SSH between the cluster servers
Passwordless SSH between servers is very basic and is not explained in detail here; a minimal sketch is shown below.
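A minimal sketch, assuming the hostnames master, node1 and node2 used throughout this article, run on the master as root:
# Generate a key pair on the master (empty passphrase, default location)
ssh-keygen -t rsa -b 2048 -N "" -f ~/.ssh/id_rsa
# Push the public key to every node, including the master itself
for h in master node1 node2; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub root@$h
done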
C. Disable the firewall, swap, and SELinux on the cluster servers
Disable the firewall (run on all three servers):
systemctl disable firewalld && systemctl stop firewalld
Turn off swap:
swapoff -a
Disabling SELinux is likewise not detailed here; just check the result. The following output means it is disabled:
[root@node2 ~]# getenforce
Disabled
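Note that swapoff -a only lasts until the next reboot. A small sketch to keep swap off permanently and to make sure SELinux stays disabled, assuming a stock /etc/fstab and /etc/selinux/config:
# Comment out any swap entry so it is not mounted again on boot
sed -ri 's/^([^#].*[[:space:]]swap[[:space:]])/#\1/' /etc/fstab
# Keep SELinux disabled after the next reboot
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config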
D. Fix the hostnames; use this same /etc/hosts on all three servers:
[root@node2 ~]# cat /etc/hosts
127.0.0.1      localhost localhost.localdomain localhost4 localhost4.localdomain4
::1            localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.217.19 master k8s-master
192.168.217.20 node1  k8s-node1
192.168.217.21 node2  k8s-node2
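A small sketch for pushing the same hosts file from the master to the worker nodes (relies on the passwordless SSH from step B, with the hostnames shown above):
for h in node1 node2; do
  scp /etc/hosts root@$h:/etc/hosts
done
# Each machine's own hostname can be set to match, e.g. on 192.168.217.19:
# hostnamectl set-hostname master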
E. Local yum repository and base packages
For building a fully local repository, see my post: Linux的完全本地仓库搭建指南(科普扫盲贴)_晚风_END的博客-CSDN博客_linux创建仓库
F. Docker environment
For the offline installation and local configuration of Docker, see my post: docker的离线安装以及本地化配置_晚风_END的博客-CSDN博客
3. Upload the offline installation packages to the servers
The offline packages are as follows:
[root@master ~]# ll
total 3016
-rw-r--r-- 1 root root 3069556 Oct 22 21:28 flannel                 # CNI plugin binary
drwxr-xr-x 3 root root    4096 Oct 22 21:10 k8s-offline-rpm         # rpm packages, to be mounted as a local repository
drwxr-xr-x 2 root root    4096 Oct 22 21:11 kubeadin-offline-image  # docker images
-rw-r--r-- 1 root root    4813 Oct 22 21:18 kube-flannel.yml        # flannel deployment manifest
# All of the above live in /root. The worker nodes only need the first three; they do not need kube-flannel.yml.
Download link for the offline packages:
链接:https://pan.baidu.com/s/1_vlzm3YMxOewIx2HaDOKJg?pwd=sdaa
提取码:sdaa
Enable the local repository:
cat > /etc/yum.repos.d/k8s.repo <<EOF
[k8s]
name=k8s
baseurl=file:///root/k8s-offline-rpm
enabled=1
gpgcheck=0
EOF
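A quick sanity check that yum can actually read the local repository (this assumes the k8s-offline-rpm directory already contains repodata, i.e. it was built with createrepo as described in the post linked above):
yum clean all
yum repolist | grep -i k8s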
Import the offline docker images:
cd kubeadin-offline-image
for i in `ls /root/kubeadin-offline-image`; do docker load -i $i; done
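To confirm the images were imported (the exact list depends on the contents of the offline package):
docker images | grep -E 'kube|etcd|coredns|pause|flannel'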
Make the flannel plugin executable:
chmod a+x flannel
Install the required packages:
yum install -y kubeadm-1.22.2 kubelet-1.22.2 kubectl-1.22.2 conntrack-tools libseccomp \
    libtool-ltdl device-mapper-persistent-data lvm2
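The kubeadm init output further below warns that the kubelet service is not enabled; enabling it right after installation avoids that warning (the kubelet will keep restarting until kubeadm writes its configuration, which is expected):
systemctl enable kubelet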
4. Initializing the cluster and joining the worker nodes
Option 1: initialize from the command line
Because yum installed version 1.22.2 above, the same version is pinned here; --apiserver-advertise-address is the master server's IP.
kubeadm init \
  --apiserver-advertise-address=192.168.217.19 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.22.2 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16
The output of the init command looks like this:
[init] Using Kubernetes version: v1.22.2
[preflight] Running pre-flight checks
        [WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"    # kubeadm generates all of the certificates below automatically
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master] and IPs [10.96.0.1 192.168.217.19]    # note the cluster service IP 10.96.0.1
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key    # etcd certificates
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost master] and IPs [192.168.217.19 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master] and IPs [192.168.217.19 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"    # kubeconfig files are written here
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"    # kubelet configuration
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet    # the kubelet service is started
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"    # static pod: kube-apiserver
[control-plane] Creating static Pod manifest for "kube-controller-manager"    # static pod: kube-controller-manager
[control-plane] Creating static Pod manifest for "kube-scheduler"    # static pod: kube-scheduler
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"    # static pod: etcd
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 13.504364 seconds    # health checks passed
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node master as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]    # labels added to the master node
[mark-control-plane] Marking the node master as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]    # taint added to the master node
[bootstrap-token] Using token: b1zldq.89t1aea8szja9d7l
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles    # RBAC rules are set up below
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace    # cluster info stored in a ConfigMap
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS    # CoreDNS add-on deployed
[addons] Applied essential addon: kube-proxy    # kube-proxy add-on deployed

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.217.19:6443 --token b1zldq.89t1aea8szja9d7l \
        --discovery-token-ca-cert-hash sha256:6ac4ccaf392e4173b7fd9c09cebfd0e2d7eb5ff5a826f39409701fe012ad2ba4
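Before moving on, make kubectl usable on the master as the output above suggests (the persistent form is shown here; export KUBECONFIG=/etc/kubernetes/admin.conf is the temporary alternative):
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config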
Option 2: initialize with a configuration file
The file is named kubeadm-init.yaml; a template can be generated with the kubeadm command:
kubeadm config print init-defaults > kubeadm-init.yaml
Eight places in this template need to be changed:
- ttl: 24h0m0s becomes ttl: "0", so that the token generated at init never expires
- advertiseAddress: 1.2.3.4 becomes advertiseAddress: 192.168.217.19, the IP of the master node
- name: node becomes name: master, i.e. the hostname of the master node (my master's hostname is master)
- dns: {} becomes dns: type: CoreDNS, specifying the cluster DNS type
- imageRepository: k8s.gcr.io becomes registry.aliyuncs.com/google_containers, the Aliyun mirror, which speeds up image pulls (image localization)
- podSubnet: "10.244.0.0/16" is added; the template does not contain it. It is equivalent to the --pod-network-cidr flag of kubeadm init and is the network segment used by pods.
- serviceSubnet: "" can be left unset; the default is 10.96.0.0/12, the network segment used by Service resources.
- kubernetesVersion: 1.22.2, the Kubernetes version, matching the kubeadm and kubelet installed above
To check the version:
[root@master ~]# kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.2", GitCommit:"8b5a19147530eaac9476b0ab82980b4088bbc1b2", GitTreeState:"clean", BuildDate:"2021-09-15T21:37:34Z", GoVersion:"go1.16.8", Compiler:"gc", Platform:"linux/amd64"}
Example init config file, kubeadm-init.yaml:
apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: "0"
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.217.19
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  imagePullPolicy: IfNotPresent
  name: master
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.22.2
networking:
  dnsDomain: cluster.local
  podSubnet: "10.244.0.0/16"
  serviceSubnet: ""
scheduler: {}
Initialize the cluster using the config file:
kubeadm init --config=kubeadm-init.yaml
Joining the worker nodes:
Option 1: join from the command line. Run this on the 20 and 21 nodes:
kubeadm join 192.168.217.19:6443 --token b1zldq.89t1aea8szja9d7l \
    --discovery-token-ca-cert-hash sha256:6ac4ccaf392e4173b7fd9c09cebfd0e2d7eb5ff5a826f39409701fe012ad2ba4
The command's output looks like this:
[root@node1 ~]# kubeadm join 192.168.217.19:6443 --token b1zldq.89t1aea8szja9d7l \
> --discovery-token-ca-cert-hash sha256:6ac4ccaf392e4173b7fd9c09cebfd0e2d7eb5ff5a826f39409701fe012ad2ba4
[preflight] Running pre-flight checks
        [WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Starting the kubelet
[kubelet-start] Waiting for the kubelet to perform the TLS Bootstrap...

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the control-plane to see this node join the cluster.
Option 2: add worker nodes with a kubeadm config file
Generate a template for joining a worker node:
kubeadm config print join-defaults >kubeadm-join.yaml
Edit kubeadm-join.yaml; the following fields need to be changed:
- apiServerEndpoint: the address used to reach the apiserver, i.e. the master's API address; here it becomes 192.168.217.19:6443. For a multi-master deployment, use the cluster VIP instead.
- token and tlsBootstrapToken: the token used to connect to the master; it must match the token configured in the InitConfiguration on the master.
- name: the name of the node. If a hostname is used, make sure the master can resolve it; otherwise simply use the IP address.
In this example's kubeadm-join.yaml, the values of token and tlsBootstrapToken are the token printed on the last line of the init output above. The file is used on a worker node with:
kubeadm join --config=kubeadm-join.yaml
The example file looks like this:
apiVersion: kubeadm.k8s.io/v1beta3
caCertPath: /etc/kubernetes/pki/ca.crt
kind: JoinConfiguration
discovery:
  bootstrapToken:
    apiServerEndpoint: 192.168.217.19:6443
    token: b1zldq.89t1aea8szja9d7l
    unsafeSkipCAVerification: true
  tlsBootstrapToken: b1zldq.89t1aea8szja9d7l
5. Deploying the network plugin (the very simple way)
Run on the master node:
kubectl apply -f kube-flannel.yml
chmod a+x flannel
cp flannel /opt/cni/bin/
scp flannel node1:/opt/cni/bin/
scp flannel node2:/opt/cni/bin/
Restart the kubelet service on all three nodes:
systemctl restart kubelet
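Optionally, verify that the flannel DaemonSet pods reach Running before checking node status (pod names will differ in your environment):
kubectl get pods -A | grep flannel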
Now check the node status on the master:
[root@master ~]# kubectl get no
NAME     STATUS   ROLES                  AGE   VERSION
master   Ready    control-plane,master   26m   v1.22.2
node1    Ready    <none>                 25m   v1.22.2
node2    Ready    <none>                 25m   v1.22.2
6. Fixing a small cluster bug and testing cluster functionality
The small bug fix:
Edit /etc/kubernetes/manifests/kube-controller-manager.yaml and delete the line "- --port=0".
Edit /etc/kubernetes/manifests/kube-scheduler.yaml and delete the line "- --port=0".
Then restart the kubelet service: systemctl restart kubelet
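If you prefer to apply the same fix non-interactively, a sketch with sed (assuming the stock manifests written by kubeadm 1.22):
sed -i '/- --port=0/d' /etc/kubernetes/manifests/kube-controller-manager.yaml
sed -i '/- --port=0/d' /etc/kubernetes/manifests/kube-scheduler.yaml
systemctl restart kubelet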
Check the cluster's health status:
[root@master ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE                         ERROR
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true","reason":""}
scheduler            Healthy   ok
Check that the cluster pods are running normally and that the services have proper cluster IPs:
[root@master ~]# kubectl get po,svc -A
NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE
kube-system   pod/coredns-7f6cbbb7b8-dcnpf         1/1     Running   0          109m
kube-system   pod/coredns-7f6cbbb7b8-hg5t8         1/1     Running   0          109m
kube-system   pod/etcd-master                      1/1     Running   0          109m
kube-system   pod/kube-apiserver-master            1/1     Running   0          109m
kube-system   pod/kube-controller-manager-master   1/1     Running   0          56s
kube-system   pod/kube-flannel-ds-22k8b            1/1     Running   0          94m
kube-system   pod/kube-flannel-ds-mgvsj            1/1     Running   0          94m
kube-system   pod/kube-flannel-ds-v8ml5            1/1     Running   0          94m
kube-system   pod/kube-proxy-hstwd                 1/1     Running   0          107m
kube-system   pod/kube-proxy-sqmfq                 1/1     Running   0          107m
kube-system   pod/kube-proxy-z2cmx                 1/1     Running   0          109m
kube-system   pod/kube-scheduler-master            1/1     Running   0          111s

NAMESPACE     NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP                  109m
kube-system   service/kube-dns     ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   109m
Create an nginx pod and see whether it deploys normally:
[root@master ~]# kubectl create deploy nginx --image=nginx:1.20
deployment.apps/nginx created
[root@master ~]# kubectl get po
NAME                  READY   STATUS              RESTARTS   AGE
nginx-7fb9867-ssqsr   0/1     ContainerCreating   0          9s
[root@master ~]# kubectl get po
NAME                  READY   STATUS              RESTARTS   AGE
nginx-7fb9867-ssqsr   0/1     ContainerCreating   0          11s
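As an optional extra check of Service networking, the test deployment can be exposed and curled from the master once the pod reaches Running (the service name nginx below is simply what kubectl expose derives from the deployment):
kubectl expose deploy nginx --port=80 --target-port=80
CLUSTER_IP=$(kubectl get svc nginx -o jsonpath='{.spec.clusterIP}')
curl -s -o /dev/null -w "%{http_code}\n" http://$CLUSTER_IP   # expect 200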
Summary:
It should be stressed that a cluster deployed this way is only suitable for testing: etcd runs as a single instance rather than a highly available cluster, and the apiserver is not highly available either. A production-grade, highly available kubeadm cluster will be covered in a follow-up post.
Appendix:
Deploying the kubeadm-based Kubernetes cluster online instead:
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
The remaining steps are basically the same and need no changes; the online approach is simply slower because the images have to be downloaded one by one. The Kubernetes version in the cluster-init command must also be changed to match whatever version yum installs. For example:
If yum installs kubernetes 1.23.9:
yum install -y kubeadm-1.23.9 kubelet-1.23.9 kubectl-1.23.9 conntrack-tools libseccomp \
    libtool-ltdl device-mapper-persistent-data lvm2
then the version in the init command must be changed accordingly:
kubeadm init \
  --apiserver-advertise-address=192.168.217.19 \
  --image-repository registry.aliyuncs.com/google_containers \
  --kubernetes-version v1.23.9 \
  --service-cidr=10.96.0.0/12 \
  --pod-network-cidr=10.244.0.0/16