Preface:
Installing a kubernetes cluster is the first hurdle anyone learning kubernetes has to face, and it really is not easy to deploy, especially with the binary approach. Tools such as minikube and kubeadm have greatly reduced the difficulty, but for a learning or test environment, how can we deploy a kubernetes cluster quickly, simply, and elegantly?
Right now, the best answer is the kubekey project, also known as kk.
The project dramatically lowers the barrier to deploying a kubernetes cluster. It can very quickly deploy a single-master cluster or a multi-master highly available cluster; an online installation usually takes at most about ten minutes, and an offline installation can shorten that to one or two minutes.
This article walks through using kubekey to deploy a single-master kubernetes cluster.
I.
Where to download the kubekey project
Releases · kubesphere/kubekey · GitHub
Roughly translating the About section: kubekey can install only a kubernetes cluster, or install kubernetes together with kubesphere; it supports multi-cloud architectures, multiple worker nodes, and highly available kubernetes clusters.
Note that because kubekey is a sub-project of the kubesphere company, it is not decoupled from kubesphere. In other words, you can use kubekey to install only a kubernetes cluster, or to install kubernetes and kubesphere together, but you cannot use kubekey to install kubesphere alone.
OK, this article will use kubekey to deploy only a single-master kubernetes cluster (an online deployment; note, not the offline approach, which I have not worked out yet).
For an installation tool, newer is generally better: newer releases support more kubernetes versions, fix more bugs, and carry more features.
Since this is only a demonstration, I picked a release more or less at random; this article uses kubekey-v3.1.0-alpha.0-linux-amd64.tar.gz.
The latest stable release, 3.0.8, is also quite attractive, since it covers fairly recent technology; if you want to try a newer kubernetes version, that release is a good fit.
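For reference, fetching and unpacking the release used here might look like the following sketch (the URL follows the standard GitHub release path for the kubesphere/kubekey repository; adjust the version to whichever release you choose):

wget https://github.com/kubesphere/kubekey/releases/download/v3.1.0-alpha.0/kubekey-v3.1.0-alpha.0-linux-amd64.tar.gz
tar -zxvf kubekey-v3.1.0-alpha.0-linux-amd64.tar.gz   # extracts the kk binary into the current directory
./kk version                                          # confirm the binary runs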
II.
Prerequisites for using kubekey
A rough translation of the requirements:
First, each server needs 2 CPU cores, 4 GB of RAM, and at least 20 GB of free disk space.
Second, the SSH service must be working, and all nodes must be reachable over SSH.
Third, the curl and openssl commands must be runnable with sudo if you deploy as a non-root user.
Fourth, a Docker environment.
Fifth, SELinux is disabled or properly configured; simply disabling SELinux is recommended.
Sixth, ideally the servers are clean, freshly installed systems.
Seventh, socat and conntrack must be installed; these two are hard dependencies. ebtables, ipset, and ipvsadm are softer dependencies that can be skipped, but installing them is still recommended.
To sum up, on CentOS 7 you need: a Docker environment (it can be left uninstalled and kubekey will install it), a time service, sshd, the server passwords, SELinux and the firewall disabled, and a usable external yum repository, ideally both the base repository and EPEL.
Install the dependencies with:
yum install conntrack socat ipset ipvsadm ebtables -y
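Besides those packages, the summary above also calls for SELinux and the firewall to be turned off and for the node clocks to stay in sync. A minimal sketch of that preparation on CentOS 7 (kubekey's own init script repeats most of this anyway, as we will see in the last section):

systemctl disable --now firewalld                                     # stop and disable the firewall
setenforce 0                                                          # switch SELinux to permissive for this boot
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config   # persist the SELinux change across reboots
yum install -y chrony && systemctl enable --now chronyd               # keep node clocks in sync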
The two servers used in this example:
IP: 192.168.123.11 and 192.168.123.12, running CentOS 7
[root@node1 ~]# cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)
When deploying with kubeadm or binaries we often also upgrade the kernel, so why doesn't kubekey mention it? Upgrading the kernel only makes the kubernetes cluster more stable, and since the goal here is a test or learning environment, the kernel upgrade can be skipped.
III.
Generating the deployment configuration file with kubekey
kubekey is quite similar to kubeadm in that it can be driven by a configuration file: you describe in the file how the kubernetes cluster should be installed and then hand it to kubekey.
Following the kubekey documentation, we use its advanced deployment method, i.e. the configuration-file approach.
#### Note: it is recommended to place the kubekey tarball on the master node; once extracted, the kk binary can be used directly.
Generate the configuration file:
./kk create config [--with-kubernetes version] [--with-kubesphere version] [(-f | --filename) path]
Based on the example syntax, run the following command to generate the configuration file 1.22.yaml (note the -- in front of with-kubernetes):
./kk create config --with-kubernetes 1.22.16 -f 1.22.yaml
The generated file content is as follows:
[root@node1 ~]# vim 1.22.yaml
[root@node1 ~]# cat 1.22.yaml

apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: node1, address: 172.16.0.2, internalAddress: 172.16.0.2, user: ubuntu, password: "Qcloud@123"}
  - {name: node2, address: 172.16.0.3, internalAddress: 172.16.0.3, user: ubuntu, password: "Qcloud@123"}
  roleGroups:
    etcd:
    - node1
    control-plane:
    - node1
    worker:
    - node1
    - node2
  controlPlaneEndpoint:
    ## Internal loadbalancer for apiservers
    # internalLoadbalancer: haproxy
    domain: lb.kubesphere.local
    address: ""
    port: 6443
  kubernetes:
    version: v1.23.10
    clusterName: cluster.local
    autoRenewCerts: true
    containerManager: docker
  etcd:
    type: kubekey
  network:
    plugin: calico
    kubePodsCIDR: 10.233.64.0/18
    kubeServiceCIDR: 10.233.0.0/18
    ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni
    multusCNI:
      enabled: false
  registry:
    privateRegistry: ""
    namespaceOverride: ""
    registryMirrors: []
    insecureRegistries: []
  addons: []
Clearly this file does not yet match our environment, so it needs a few changes. The modified file is shown below:
### Mainly the IP addresses, passwords, and CIDRs are changed
apiVersion: kubekey.kubesphere.io/v1alpha2
kind: Cluster
metadata:
  name: sample
spec:
  hosts:
  - {name: node1, address: 192.168.123.11, internalAddress: 192.168.123.11, user: root, password: "密码"}
  - {name: node2, address: 192.168.123.12, internalAddress: 192.168.123.12, user: root, password: "密码"}
  roleGroups:
    etcd:
    - node1
    control-plane:
    - node1
    worker:
    - node1
    - node2
  controlPlaneEndpoint:
    ## Internal loadbalancer for apiservers
    # internalLoadbalancer: haproxy
    domain: lb.kubesphere.local
    address: ""
    port: 6443
  kubernetes:
    version: 1.22.16
    clusterName: cluster.local
    autoRenewCerts: true
    containerManager: docker
  etcd:
    type: kubekey
  network:
    plugin: calico
    kubePodsCIDR: 10.244.0.0/24
    kubeServiceCIDR: 10.96.0.0/24
    ## multus support. https://github.com/k8snetworkplumbingwg/multus-cni
    multusCNI:
      enabled: false
  registry:
    privateRegistry: ""
    namespaceOverride: ""
    registryMirrors: []
    insecureRegistries: []
  addons: []
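If you would rather not keep root passwords in the file, kubekey's host entries also support key-based SSH via a privateKeyPath field (per the kubekey sample configuration). A hedged sketch, assuming the key has already been distributed to both nodes:

  hosts:
  - {name: node1, address: 192.168.123.11, internalAddress: 192.168.123.11, user: root, privateKeyPath: "~/.ssh/id_rsa"}
  - {name: node2, address: 192.168.123.12, internalAddress: 192.168.123.12, user: root, privateKeyPath: "~/.ssh/id_rsa"}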
IV.
Applying the modified configuration file and starting the deployment
./kk create cluster -f 1.22.yaml
Before starting the deployment, because of network restrictions (the firewall), we need to set an environment variable so that kubekey pulls its images and binaries from mirrors inside China:
export KKZONE=cn
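Putting the two together, the full invocation looks like this (a minimal sketch; the variable must be exported in the same shell session that runs kk):

export KKZONE=cn
./kk create cluster -f 1.22.yaml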
The output of the command looks roughly like this:
#### Note: as long as the pre-check table shows y where it matters, just type yes to start the installation.
[root@centos1 ~]# ./kk create cluster -f 123.yaml

 _   __      _          _   __
| | / /     | |        | | / /
| |/ /  _   _| |__   ___| |/ /  ___ _   _
|    \| | | | '_ \ / _ \    \ / _ \ | | |
| |\  \ |_| | |_) |  __/ |\  \  __/ |_| |
\_| \_/\__,_|_.__/ \___\_| \_/\___|\__, |
                                    __/ |
                                   |___/

11:52:54 CST [GreetingsModule] Greetings
11:52:54 CST message: [node2]
Greetings, KubeKey!
11:52:55 CST message: [node1]
Greetings, KubeKey!
11:52:55 CST success: [node2]
11:52:55 CST success: [node1]
11:52:55 CST [NodePreCheckModule] A pre-check on nodes
11:53:01 CST success: [node2]
11:53:01 CST success: [node1]
11:53:01 CST [ConfirmModule] Display confirmation form
+-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
| name  | sudo | curl | openssl | ebtables | socat | ipset | ipvsadm | conntrack | chrony | docker | containerd | nfs client | ceph client | glusterfs client | time         |
+-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+
| node1 | y    | y    | y       | y        | y     | y     |         | y         |        |        |            |            |             |                  | CST 11:53:01 |
| node2 | y    | y    | y       | y        | y     | y     |         | y         |        |        |            |            |             |                  | CST 11:52:55 |
+-------+------+------+---------+----------+-------+-------+---------+-----------+--------+--------+------------+------------+-------------+------------------+--------------+

This is a simple check of your environment.
Before installation, ensure that your machines meet all requirements specified at
https://github.com/kubesphere/kubekey#requirements-and-recommendations

Continue this installation? [yes/no]:
After entering yes, kubekey starts downloading components and performing the installation. You can see that it downloads kubeadm and uses quite a few scripts:
11:54:24 CST success: [LocalHost]
11:54:24 CST [NodeBinariesModule] Download installation binaries
11:54:24 CST message: [localhost] downloading amd64 kubeadm v1.22.16 ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 43.7M  100 43.7M    0     0   998k      0  0:00:44  0:00:44 --:--:-- 1031k
11:55:09 CST message: [localhost] downloading amd64 kubelet v1.22.16 ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  115M  100  115M    0     0  1017k      0  0:01:56  0:01:56 --:--:-- 1078k
11:57:06 CST message: [localhost] downloading amd64 kubectl v1.22.16 ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 44.7M  100 44.7M    0     0  1017k      0  0:00:45  0:00:45 --:--:-- 1151k
11:57:51 CST message: [localhost] downloading amd64 helm v3.9.0 ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 44.0M  100 44.0M    0     0  1012k      0  0:00:44  0:00:44 --:--:-- 1082k
11:58:36 CST message: [localhost] downloading amd64 kubecni v1.2.0 ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 38.6M  100 38.6M    0     0  1008k      0  0:00:39  0:00:39 --:--:-- 1143k
11:59:16 CST message: [localhost] downloading amd64 crictl v1.24.0 ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 13.8M  100 13.8M    0     0  1044k      0  0:00:13  0:00:13 --:--:-- 1154k
11:59:29 CST message: [localhost] downloading amd64 etcd v3.4.13 ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 16.5M  100 16.5M    0     0  1012k      0  0:00:16  0:00:16 --:--:-- 1066k
11:59:46 CST message: [localhost] downloading amd64 docker 20.10.8 ...
The final output is as follows:
poddisruptionbudget.policy/calico-kube-controllers created
12:05:00 CST success: [node1]
12:05:00 CST [ConfigureKubernetesModule] Configure kubernetes
12:05:00 CST success: [node1]
12:05:00 CST [ChownModule] Chown user $HOME/.kube dir
12:05:00 CST success: [node2]
12:05:00 CST success: [node1]
12:05:00 CST [AutoRenewCertsModule] Generate k8s certs renew script
12:05:01 CST success: [node1]
12:05:01 CST [AutoRenewCertsModule] Generate k8s certs renew service
12:05:02 CST success: [node1]
12:05:02 CST [AutoRenewCertsModule] Generate k8s certs renew timer
12:05:02 CST success: [node1]
12:05:02 CST [AutoRenewCertsModule] Enable k8s certs renew service
12:05:03 CST success: [node1]
12:05:03 CST [SaveKubeConfigModule] Save kube config as a configmap
12:05:03 CST success: [LocalHost]
12:05:03 CST [AddonsModule] Install addons
12:05:03 CST success: [LocalHost]
12:05:03 CST Pipeline[CreateClusterPipeline] execute successfully
Installation is complete.

Please check the result using the command:

        kubectl get pod -A
It really is that simple: after waiting about ten minutes, a usable kubernetes 1.22.16 cluster is deployed.
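To double-check the result, a couple of quick commands on node1 (where kubekey writes the kubeconfig under $HOME/.kube, per the ChownModule step above) are enough; a minimal sketch:

kubectl get nodes -o wide   # both nodes should report STATUS Ready
kubectl get pod -A          # every kube-system pod should be Running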
V.
Summary:
What exactly does kubekey do while installing the kubernetes cluster, and does the resulting cluster have any shortcomings?
1.
Broadly speaking, kubekey downloads some of the kubernetes binaries, for example kubelet, kubeadm, and helm, plus a number of scripts, configuration files, and so on.
These all land under the kubekey directory:
[root@node1 kubekey]# ll
total 12
drwxr-xr-x. 3 root root   20 Jul 16 11:58 cni
-rw-r--r--. 1 root root 5667 Jul 16 12:04 config-sample
drwxr-xr-x. 3 root root   21 Jul 16 11:59 crictl
drwxr-xr-x. 3 root root   21 Jul 16 11:59 docker
drwxr-xr-x. 3 root root   21 Jul 16 11:59 etcd
drwxr-xr-x. 3 root root   20 Jul 16 11:57 helm
drwxr-xr-x. 3 root root   22 Jul 16 11:54 kube
drwxr-xr-x. 2 root root   53 Jul 16 11:52 logs
drwxr-xr-x. 2 root root 4096 Jul 16 12:37 node1
drwxr-xr-x. 2 root root  137 Jul 16 12:04 node2
drwxr-xr-x. 3 root root   18 Jul 16 12:03 pki
More installation details can be found in the log files under the logs directory; interested readers can dig into them. The OS initialization script in particular is worth a look:
[root@node1 node1]# ls
10-kubeadm.conf      backup-etcd.timer  daemon.json     etcd-backup.sh  etcd.service  k8s-certs-renew.service  k8s-certs-renew.timer  kubelet.service      nodelocaldnsConfigmap.yaml
backup-etcd.service  coredns-svc.yaml   docker.service  etcd.env        initOS.sh     k8s-certs-renew.sh       kubeadm-config.yaml    network-plugin.yaml  nodelocaldns.yaml
[root@node1 node1]# pwd
/root/kubekey/node1
As you can see, the initialization script disables the firewall and SELinux and applies a set of kernel optimizations:
[root@node1 node1]# cat initOS.sh
#!/usr/bin/env bash
# Copyright 2020 The KubeSphere Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

swapoff -a
sed -i /^[^#]*swap*/s/^/\#/g /etc/fstab

# See https://github.com/kubernetes/website/issues/14457
if [ -f /etc/selinux/config ]; then
  sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config
fi
# for ubuntu: sudo apt install selinux-utils
# for centos: yum install selinux-policy
if command -v setenforce &> /dev/null
then
  setenforce 0
  getenforce
fi

echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf
echo 'net.bridge.bridge-nf-call-arptables = 1' >> /etc/sysctl.conf
echo 'net.bridge.bridge-nf-call-ip6tables = 1' >> /etc/sysctl.conf
echo 'net.bridge.bridge-nf-call-iptables = 1' >> /etc/sysctl.conf
echo 'net.ipv4.ip_local_reserved_ports = 30000-32767' >> /etc/sysctl.conf
echo 'vm.max_map_count = 262144' >> /etc/sysctl.conf
echo 'vm.swappiness = 1' >> /etc/sysctl.conf
echo 'fs.inotify.max_user_instances = 524288' >> /etc/sysctl.conf
echo 'kernel.pid_max = 65535' >> /etc/sysctl.conf

#See https://imroc.io/posts/kubernetes/troubleshooting-with-kubernetes-network/
sed -r -i "s@#{0,}?net.ipv4.tcp_tw_recycle ?= ?(0|1)@net.ipv4.tcp_tw_recycle = 0@g" /etc/sysctl.conf
sed -r -i "s@#{0,}?net.ipv4.ip_forward ?= ?(0|1)@net.ipv4.ip_forward = 1@g" /etc/sysctl.conf
sed -r -i "s@#{0,}?net.bridge.bridge-nf-call-arptables ?= ?(0|1)@net.bridge.bridge-nf-call-arptables = 1@g" /etc/sysctl.conf
sed -r -i "s@#{0,}?net.bridge.bridge-nf-call-ip6tables ?= ?(0|1)@net.bridge.bridge-nf-call-ip6tables = 1@g" /etc/sysctl.conf
sed -r -i "s@#{0,}?net.bridge.bridge-nf-call-iptables ?= ?(0|1)@net.bridge.bridge-nf-call-iptables = 1@g" /etc/sysctl.conf
sed -r -i "s@#{0,}?net.ipv4.ip_local_reserved_ports ?= ?([0-9]{1,}-{0,1},{0,1}){1,}@net.ipv4.ip_local_reserved_ports = 30000-32767@g" /etc/sysctl.conf
sed -r -i "s@#{0,}?vm.max_map_count ?= ?([0-9]{1,})@vm.max_map_count = 262144@g" /etc/sysctl.conf
sed -r -i "s@#{0,}?vm.swappiness ?= ?([0-9]{1,})@vm.swappiness = 1@g" /etc/sysctl.conf
sed -r -i "s@#{0,}?fs.inotify.max_user_instances ?= ?([0-9]{1,})@fs.inotify.max_user_instances = 524288@g" /etc/sysctl.conf
sed -r -i "s@#{0,}?kernel.pid_max ?= ?([0-9]{1,})@kernel.pid_max = 65535@g" /etc/sysctl.conf

tmpfile="$$.tmp"
awk ' !x[$0]++{print > "'$tmpfile'"}' /etc/sysctl.conf
mv $tmpfile /etc/sysctl.conf

systemctl stop firewalld 1>/dev/null 2>/dev/null
systemctl disable firewalld 1>/dev/null 2>/dev/null
systemctl stop ufw 1>/dev/null 2>/dev/null
systemctl disable ufw 1>/dev/null 2>/dev/null

modinfo br_netfilter > /dev/null 2>&1
if [ $? -eq 0 ]; then
   modprobe br_netfilter
   mkdir -p /etc/modules-load.d
   echo 'br_netfilter' > /etc/modules-load.d/kubekey-br_netfilter.conf
fi

modinfo overlay > /dev/null 2>&1
if [ $? -eq 0 ]; then
   modprobe overlay
   echo 'overlay' >> /etc/modules-load.d/kubekey-br_netfilter.conf
fi

modprobe ip_vs
modprobe ip_vs_rr
modprobe ip_vs_wrr
modprobe ip_vs_sh

cat > /etc/modules-load.d/kube_proxy-ipvs.conf << EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
EOF

modprobe nf_conntrack_ipv4 1>/dev/null 2>/dev/null
if [ $? -eq 0 ]; then
   echo 'nf_conntrack_ipv4' > /etc/modules-load.d/kube_proxy-ipvs.conf
else
   modprobe nf_conntrack
   echo 'nf_conntrack' > /etc/modules-load.d/kube_proxy-ipvs.conf
fi
sysctl -p

sed -i ':a;$!{N;ba};s@# kubekey hosts BEGIN.*# kubekey hosts END@@' /etc/hosts
sed -i '/^$/N;/\n$/N;//D' /etc/hosts

cat >>/etc/hosts<<EOF
# kubekey hosts BEGIN
192.168.123.11  node1.cluster.local node1
192.168.123.12  node2.cluster.local node2
192.168.123.11  lb.kubesphere.local
# kubekey hosts END
EOF

echo 3 > /proc/sys/vm/drop_caches

# Make sure the iptables utility doesn't use the nftables backend.
update-alternatives --set iptables /usr/sbin/iptables-legacy >/dev/null 2>&1 || true
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy >/dev/null 2>&1 || true
update-alternatives --set arptables /usr/sbin/arptables-legacy >/dev/null 2>&1 || true
update-alternatives --set ebtables /usr/sbin/ebtables-legacy >/dev/null 2>&1 || true

ulimit -u 65535
ulimit -n 65535
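If you want to verify what the script actually changed on a node, a few hedged spot checks might look like this:

sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables   # both values should be 1
lsmod | egrep 'br_netfilter|ip_vs'                              # the modules loaded by the script
getenforce                                                      # should print Permissive or Disabled
systemctl is-active firewalld                                   # should print inactive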
2.
Where does the kubernetes cluster installed by kubekey's defaults fall short?
In my view, the handling of etcd is the weak point: it is only a single-instance etcd, so a cluster like this cannot be used in production. Although it is an external etcd (running as a systemd service rather than a static pod), it is not an etcd cluster, so it cannot guarantee stability.
[root@node1 node1]# kubectl get po -A
NAMESPACE     NAME                                       READY   STATUS    RESTARTS   AGE
kube-system   calico-kube-controllers-69cfcfdf6c-2dl2z   1/1     Running   0          41m
kube-system   calico-node-8l2vk                          1/1     Running   0          41m
kube-system   calico-node-plbbn                          1/1     Running   0          41m
kube-system   coredns-5495dd7c88-7746t                   1/1     Running   0          42m
kube-system   coredns-5495dd7c88-gzxl2                   1/1     Running   0          42m
kube-system   kube-apiserver-node1                       1/1     Running   0          42m
kube-system   kube-controller-manager-node1              1/1     Running   0          42m
kube-system   kube-proxy-ld97n                           1/1     Running   0          42m
kube-system   kube-proxy-q7zzm                           1/1     Running   0          41m
kube-system   kube-scheduler-node1                       1/1     Running   0          42m
kube-system   nodelocaldns-9l8lf                         1/1     Running   0          42m
kube-system   nodelocaldns-hw4tn                         1/1     Running   0          41m
[root@node1 node1]# systemctl status etcd
● etcd.service - etcd
   Loaded: loaded (/etc/systemd/system/etcd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2023-07-16 12:03:33 CST; 43min ago
 Main PID: 4872 (etcd)
    Tasks: 15
   Memory: 50.6M
   CGroup: /system.slice/etcd.service
           └─4872 /usr/local/bin/etcd

Jul 16 12:24:24 node1 etcd[4872]: store.index: compact 1888
Jul 16 12:24:24 node1 etcd[4872]: finished scheduled compaction at 1888 (took 400.469µs)
Jul 16 12:29:24 node1 etcd[4872]: store.index: compact 2279
Jul 16 12:29:24 node1 etcd[4872]: finished scheduled compaction at 2279 (took 351.415µs)
Jul 16 12:34:24 node1 etcd[4872]: store.index: compact 2672
Jul 16 12:34:24 node1 etcd[4872]: finished scheduled compaction at 2672 (took 403.899µs)
Jul 16 12:39:24 node1 etcd[4872]: store.index: compact 3063
Jul 16 12:39:24 node1 etcd[4872]: finished scheduled compaction at 3063 (took 355.549µs)
Jul 16 12:44:24 node1 etcd[4872]: store.index: compact 3455
Jul 16 12:44:24 node1 etcd[4872]: finished scheduled compaction at 3455 (took 346.379µs)
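For comparison, a highly available layout would spread etcd (and usually the control plane) across three nodes in roleGroups, and could enable the built-in haproxy that is commented out in the generated file. A hedged sketch, assuming hypothetical node2 and node3 entries already exist under hosts:

  roleGroups:
    etcd:
    - node1
    - node2
    - node3
    control-plane:
    - node1
    - node2
    - node3
    worker:
    - node1
    - node2
    - node3
  controlPlaneEndpoint:
    ## internal load balancer for the apiservers
    internalLoadbalancer: haproxy
    domain: lb.kubesphere.local
    address: ""
    port: 6443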
In every other respect kubekey performs essentially flawlessly (kubekey does support installing highly available kubernetes clusters; this example simply did not use that).
The next article will describe how to use kubekey to deploy a highly available kubernetes cluster and fix the etcd problem described above.