### Common ways to set up a K8S environment

**(1) Minikube**

Minikube is a tool that quickly runs a single-node K8S locally. It is aimed at users trying K8S for the first time or doing day-to-day development, and must not be used in production.

**(2) Kubeadm**

Kubeadm is a tool published by the official K8S community to simplify and speed up the deployment of a K8S cluster. It is designed to give new users an easy way to start experimenting with K8S.

**(3) Binary packages**

Besides the two options above, you can download the official binary packages and deploy every component by hand to form a K8S cluster. This is still widely used in enterprise production environments, but it demands much more of the K8S administrator.

In this walkthrough we use Kubeadm to build the cluster, so that we can later deploy an ASP.NET Core application cluster on it.

### Preparation

(1) Prepare three Linux servers.

Here I created 3 virtual machines with VMware Workstation, each with 2 CPUs and 2 GB of RAM.

(2) Configure hostnames and static IP addresses as follows:

| Role   | Hostname | IP address      |
|--------|----------|-----------------|
| Master | master   | 192.168.127.134 |
| Node   | node1    | 192.168.127.135 |
| Node   | node2    | 192.168.127.136 |

############### Run on the master ###############

### Install vim

```
yum -y install vim*
```

### Add hostname-to-IP mappings to the hosts file

Note the format is IP first, then hostname:

```
# /etc/hosts
192.168.127.134 master
192.168.127.135 node1
192.168.127.136 node2
```

### Set the DNS resolver (point it straight at a public resolver)

```
[root@localhost project]# vim /etc/resolv.conf
```

Change it to:

```
nameserver 114.114.114.114
```

### Set the hostname to master

```
# set the hostname
[root@localhost /]# hostnamectl set-hostname master
# print the hostname
[root@localhost /]# hostname
master
```

### Disable the firewall

```
systemctl stop firewalld
systemctl disable firewalld
```

### Synchronize the system clock

If the clocks disagree, worker nodes can fail to join the cluster.

```
yum install ntpdate -y
# sync against a network time server
ntpdate -u cn.pool.ntp.org
```

ntpdate can no longer be installed on CentOS 8, so use chrony instead. Add to the config:

```
[root@localhost /]# vim /etc/chrony.conf
server 210.72.145.44 iburst
server ntp.aliyun.com iburst
```

Reload the configuration and check synchronization:

```
[root@localhost /]# systemctl restart chronyd.service
[root@localhost /]# chronyc sources -v
210 Number of sources = 2

  .-- Source mode  '^' = server, '=' = peer, '#' = local clock.
 / .- Source state '*' = current synced, '+' = combined , '-' = not combined,
| /   '?' = unreachable, 'x' = time may be in error, '~' = time too variable.
||                                                 .- xxxx [ yyyy ] +/- zzzz
||      Reachability register (octal) -.           |  xxxx = adjusted offset,
||      Log2(Polling interval) --.      |          |  yyyy = measured offset,
||                                \     |          |  zzzz = estimated error.
||                                 |    |           \
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^? 210.72.145.44                 0   6     0     -     +0ns[   +0ns] +/-    0ns
^* 203.107.6.88                  2   6     7     0  +6519ns[ +654us] +/-   16ms
```

### Disable SELinux

Temporarily:

```
setenforce 0
```

Permanently, change SELINUX=enforcing to SELINUX=disabled in the config file:

```
[root@localhost /]# vim /etc/selinux/config
```

### Disable the swap partition (K8S does not support swap)

Turn swap off, then remove or comment out the swap line in /etc/fstab (e.g. `/mnt/swap swap swap defaults 0 0`). If the Swap row in `free -m` shows all zeros, swap is off:

```
[root@localhost /]# swapoff -a
[root@localhost /]# vim /etc/fstab
[root@localhost /]# free -m
              total        used        free      shared  buff/cache   available
Mem:           3711        2191         231          32        1288        1242
```

### Pass bridged IPv4 traffic to the iptables chains

```
[root@bogon ~]# vim /etc/sysctl.d/k8s.conf
```

Add:

```
net.bridge.bridge-nf-call-ip6tables=1
net.bridge.bridge-nf-call-iptables=1
```

Then apply:

```
[root@bogon ~]# sysctl --system
```

### Install wget

```
yum -y install wget
```

### Add the Aliyun Yum repository

```
[root@bogon ~]# vim /etc/yum.repos.d/kubernetes.repo
```

```
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
```

### Configure the CentOS base repos (pick the line for your release)

```
# CentOS 5
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-5.repo
# CentOS 6
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-6.repo
# CentOS 7
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
# CentOS 8
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-8.repo
```

### Clear the cache and rebuild the metadata cache

```
yum clean all
yum makecache
```

### Install Kubeadm & Kubelet & Kubectl

```
yum install -y kubelet-1.13.3 kubeadm-1.13.3 kubectl-1.13.3
systemctl enable kubelet
```

### If yum complains that kubernetes-cni is required

```
Error: Package: kubelet-1.13.3-0.x86_64 (kubernetes)
       Requires: kubernetes-cni = 0.6.0
       Available: kubernetes-cni-0.3.0.1-0.07a8a2.x86_64 (kubernetes)
           kubernetes-cni = 0.3.0.1-0.07a8a2
       Available: kubernetes-cni-0.5.1-0.x86_64 (kubernetes)
           kubernetes-cni = 0.5.1-0
       Available: kubernetes-cni-0.5.1-1.x86_64 (kubernetes)
           kubernetes-cni = 0.5.1-1
       Available: kubernetes-cni-0.6.0-0.x86_64 (kubernetes)
           kubernetes-cni = 0.6.0-0
       Installing: kubernetes-cni-0.7.5-0.x86_64 (kubernetes)
           kubernetes-cni = 0.7.5-0
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
```

Fix: install the matching kubernetes-cni version explicitly:

```
yum install -y kubelet-1.13.3 kubeadm-1.13.3 kubectl-1.13.3 kubernetes-cni-0.6.0
```

##### If yum reports that the public key for xxx.rpm is not installed

```
Retrieving key from https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
Importing GPG key 0xA7317B0F:
 Userid     : "Google Cloud Packages Automatic Signing Key <gc-team@google.com>"
 Fingerprint: d0bc 747f d8ca f711 7500 d6fa 3746 c208 a731 7b0f
 From       : https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
The public key for e3438a5f740b3a907758799c3be2512a4b5c64dbe30352b2428788775c6b359e-kubectl-1.13.3-0.x86_64.rpm is not installed
Failing package is: kubectl-1.13.3-0.x86_64
GPG keys are configured as: https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg
```

Fix: skip the public key check with `yum install xxx.rpm --nogpgcheck`, for example for kubectl and kubeadm:

```
yum install kubectl-1.13.3-0.x86_64 --nogpgcheck
yum install kubeadm-1.13.3-0.x86_64 --nogpgcheck
```

### Check the kubeadm and kubelet versions

```
kubelet --version
kubeadm version
```

### Deploy the Kubernetes Master (this step takes a while, be patient)

```
kubeadm init \
--apiserver-advertise-address=192.168.127.134 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.13.3 \
--service-cidr=10.1.0.0/16 \
--pod-network-cidr=10.244.0.0/16
```

PS: the default image registry k8s.gcr.io cannot be reached from mainland China, so we point at the Aliyun mirror (registry.aliyuncs.com/google_containers). The official recommendation is at least 2 CPUs and 2 GB of RAM per server; 1 GB of RAM also works but produces warnings, so it is best to just go to 2 GB.

#### List the images kubeadm needs

```
[root@hecs-356640 ~]# kubeadm config images list
I0807 18:05:22.909976 8792 version.go:237] remote version is much newer: v1.22.0; falling back to: stable-1.13
I0807 18:05:32.910196 8792 version.go:94] could not fetch a Kubernetes version from the internet: unable to get URL "https://dl.k8s.io/release/stable-1.13.txt": Get https://storage.googleapis.com/kubernetes-release/release/stable-1.13.txt: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
I0807 18:05:32.910226 8792 version.go:95] falling back to the local client version: v1.13.3
k8s.gcr.io/kube-apiserver:v1.13.3
k8s.gcr.io/kube-controller-manager:v1.13.3
k8s.gcr.io/kube-scheduler:v1.13.3
k8s.gcr.io/kube-proxy:v1.13.3
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.2.24
k8s.gcr.io/coredns:1.2.6
```

A successful init ends by printing a join command like:

```
kubeadm join 192.168.198.111:6443 --token 0mruyg.dz1zman6uufgh5vt --discovery-token-ca-cert-hash sha256:0eac9042e16b8e410bb8b8e3a34afea5fe7bbd48591a2205d5eb47c1cce6bc32
```

##### Common error: the kubelet version is higher than the control plane version

Fix:

```
yum -y remove kubelet
yum install kubeadm-1.13.3-0.x86_64 --nogpgcheck
yum install -y kubelet-1.13.3 --nogpgcheck
```

### To use kubectl smoothly, run the following

```
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
```

### Now kubectl works; after running kubectl get nodes you should see

```
[root@bogon ~]# kubectl get node
NAME     STATUS   ROLES    AGE    VERSION
master   Ready    master   128m   v1.13.3
```

### Deploy the Pod network plugin (CNI), still on the master

```
kubectl apply -f \
https://raw.githubusercontent.com/coreos/flannel/a70459be0084506e4ec919aa1c114638878db11b/Documentation/kube-flannel.yml
```

### Then verify with the command below

Every Pod should be Running; any other status (for example Pending or ImagePullBackOff) means the Pod is not ready:

```
[root@bogon ~]# kubectl get pod --all-namespaces
NAMESPACE     NAME                             READY   STATUS              RESTARTS   AGE
kube-system   coredns-78d4cf999f-j45tr         0/1     ContainerCreating   0          131m
kube-system   coredns-78d4cf999f-r98p7         0/1     ContainerCreating   0          131m
kube-system   etcd-master                      1/1     Running             0          131m
kube-system   kube-apiserver-master            1/1     Running             0          131m
kube-system   kube-controller-manager-master   1/1     Running             0          131m
kube-system   kube-flannel-ds-amd64-ncx7q      1/1     Running             25         118m
kube-system   kube-proxy-c744d                 1/1     Running             0          131m
kube-system   kube-scheduler-master            1/1     Running             0          130m
```

### If a Pod is not Running, inspect the error with kubectl describe

For example, to see the error details of the pod kube-flannel-ds-amd64-xpd82:

```
kubectl describe pod kube-flannel-ds-amd64-xpd82 -n kube-system
```

### Flannel image pull failures

During this step the flannel image may fail to pull from quay.io, which keeps the Pod from ever reaching Running. Workaround:

```
docker pull quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64
docker tag quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64 quay.io/coreos/flannel:v0.11.0-amd64
docker rmi quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64
```

### Now the master node's status changes from NotReady to Ready

```
kubectl get nodes
```

########## Congratulations, the Master node is done. If you only want a single-node K8S, the deployment is now complete ##############
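Before moving on it helps to script the "is every Pod Running" check described above. The helper below is a hypothetical sketch (not part of kubeadm or kubectl): it parses the output of `kubectl get pod --all-namespaces --no-headers`, where STATUS is the fourth column, and fails while printing any Pod in another state.

```shell
# all_pods_running: succeed only if every line of
# `kubectl get pod --all-namespaces --no-headers` shows STATUS=Running.
# Columns: NAMESPACE NAME READY STATUS RESTARTS AGE -> STATUS is field 4.
all_pods_running() {
  local not_running
  not_running=$(printf '%s\n' "$1" | awk 'NF && $4 != "Running"')
  if [ -n "$not_running" ]; then
    printf '%s\n' "$not_running"   # show the offending pods
    return 1
  fi
}
```

Usage: `all_pods_running "$(kubectl get pod --all-namespaces --no-headers)" || echo "cluster not ready yet"`.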
### Join a Kubernetes Node

Prepare each Node by following the same steps as for the Master, but remember: do not run kubeadm init. Then run the join command that was printed when the master was initialized; you can copy it directly:

```
kubeadm join 192.168.127.134:6443 --token ekqxk2.iiu5wx5bbnbdtxsw --discovery-token-ca-cert-hash sha256:c50bb83d04f64f4a714b745f04682b27768c1298f331e697419451f3550f2d05
```

The important part is the token printed after a successful init on the Master; if you still have it, you can skip the commands below. If you lost it, list the tokens with:

```
kubeadm token list
```

Note: a token is valid for 24 hours by default, and once expired it no longer shows up in the list. Create a new one with:

```
kubeadm token create
```

### Get the sha256 hash of the CA certificate

```
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
```

### A successful join on the Node prints the following

```
[root@bogon ~]# kubeadm join 192.168.127.134:6443 --token pudqzx.eg87n024icae54j7 --discovery-token-ca-cert-hash sha256:5a347cb51e648bb0d9f416d48cebad2606db07878a250ec9303abf4eecd11144
[preflight] Running pre-flight checks
	[WARNING SystemVerification]: this Docker version is not on the list of validated versions: 20.10.7. Latest validated version: 18.06
	[WARNING Hostname]: hostname "node1" could not be reached
	[WARNING Hostname]: hostname "node1": lookup node1 on 114.114.114.114:53: no such host
[discovery] Trying to connect to API Server "192.168.127.134:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://192.168.127.134:6443"
[discovery] Requesting info from "https://192.168.127.134:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "192.168.127.134:6443"
[discovery] Successfully established connection with API Server "192.168.127.134:6443"
[join] Reading configuration from the cluster...
[join] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.13" ConfigMap in the kube-system namespace
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "node1" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.
```

### On the master you can now see the status of every node in the cluster

```
[root@bogon ~]# kubectl get nodes
NAME     STATUS   ROLES    AGE     VERSION
master   Ready    master   5h33m   v1.13.3
node1    Ready    <none>   35m     v1.13.3
```

### Common problems

1. The certificate already exists.
   Fix: delete the certificate files in the corresponding directory and re-run the command.
2. The message below usually means the kubelet version does not match the kubeadm version; reinstall kubelet:

```
Downloading configuration for the kubelet from the "kubelet-config-1.15" ConfigMap in the kube-system namespace
configmaps "kubelet-config-1.15" is forbidden: User "system:bootstrap:6w889o" cannot get resource "configmaps" in API group "" in the namespace "kube-system"
```

   Fix:

```
yum remove kubelet
yum install -y kubelet-1.13.3 kubeadm-1.13.3 kubectl-1.13.3 kubernetes-cni-0.6.0
```

3. If either Node is not in the Ready state in `kubectl get nodes`, check which Pods are not running properly:

```
kubectl get nodes
kubectl get pod --all-namespaces
```

In the end every node in `kubectl get nodes` should be Ready and every Pod should be Running. When investigating, note which Node the error occurred on and fix it on that Node, for example by pulling the flannel image there.

### At this point a minimal K8S cluster is up

### Test the Kubernetes cluster

To quickly verify that our K8S cluster is usable, create a sample Pod:

```
kubectl create deployment nginx --image=nginx
kubectl expose deployment nginx --port=80 --type=NodePort
kubectl get pod,svc
```

```
[root@bogon ~]# kubectl get pod,svc
NAME                       READY   STATUS              RESTARTS   AGE
pod/nginx-5c7588df-v8slr   0/1     ContainerCreating   0          26m

NAME                 TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE
service/kubernetes   ClusterIP   10.1.0.1      <none>        443/TCP        5h39m
service/nginx        NodePort    10.1.16.121   <none>        80:32509/TCP   25m
```

### For more detail, such as which Node a pod was scheduled on, use kubectl get pods,svc -o wide

```
[root@bogon ~]# kubectl get pods,svc -o wide
NAME                       READY   STATUS              RESTARTS   AGE   IP       NODE    NOMINATED NODE   READINESS GATES
pod/nginx-5c7588df-v8slr   0/1     ContainerCreating   0          27m   <none>   node1   <none>           <none>

NAME                 TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE     SELECTOR
service/kubernetes   ClusterIP   10.1.0.1      <none>        443/TCP        5h41m   <none>
service/nginx        NodePort    10.1.16.121   <none>        80:32509/TCP   27m     app=nginx
```

### Access

Because the Service type is NodePort, the exposed port is picked at random from the 30000-32767 range. We can open any Node's IP address plus that port in a browser, for example http://192.168.127.134:32509/ or http://192.168.127.135:32509/.

### Cluster built successfully

If the page loads, your K8S cluster is up and running.
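Since the NodePort is assigned at random, a script that wants to curl the service first has to discover it. The helper below is a hypothetical sketch; the field position assumes the default `kubectl get svc` column layout shown above, where PORT(S) is the fifth column with a value like `80:32509/TCP`.

```shell
# node_port: extract the host port from a `kubectl get svc` line whose
# PORT(S) column (field 5) looks like 80:32509/TCP.
node_port() {
  printf '%s\n' "$1" | awk '{print $5}' | sed 's#.*:\([0-9]*\)/.*#\1#'
}
```

Usage: `port=$(node_port "$(kubectl get svc nginx --no-headers)")` and then `curl http://192.168.127.134:$port/`.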
### Install flannel (skip if it is already installed)

```
mkdir k8s
cd k8s
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
```

If the `"Network": "10.244.0.0/16"` value in the yml differs from the `--pod-network-cidr` passed to kubeadm init, edit it to match; otherwise Cluster IPs may not be reachable between Nodes. Since my kubeadm init above used 10.244.0.0/16, this yaml needs no change.

```
# cat kube-flannel.yml |grep image|uniq
        image: quay.io/coreos/flannel:v0.12.0-amd64
        image: quay.io/coreos/flannel:v0.12.0-arm64
        image: quay.io/coreos/flannel:v0.12.0-arm
        image: quay.io/coreos/flannel:v0.12.0-ppc64le
        image: quay.io/coreos/flannel:v0.12.0-s390x
```

Pull the images by hand:

```
docker pull registry.cn-shanghai.aliyuncs.com/leozhanggg/flannel:v0.12.0-amd64
docker pull registry.cn-shanghai.aliyuncs.com/leozhanggg/flannel:v0.12.0-arm64
docker pull registry.cn-shanghai.aliyuncs.com/leozhanggg/flannel:v0.12.0-arm
docker pull registry.cn-shanghai.aliyuncs.com/leozhanggg/flannel:v0.12.0-ppc64le
docker pull registry.cn-shanghai.aliyuncs.com/leozhanggg/flannel:v0.12.0-s390x
```

Apply flannel:

```
kubectl apply -f kube-flannel.yml
```

```
[root@hecs-356640 ~]# kubectl get po --all-namespaces
NAMESPACE     NAME                                  READY   STATUS    RESTARTS   AGE
kube-system   coredns-6d56c8448f-7svf6              1/1     Running   0          39m
kube-system   coredns-6d56c8448f-cp55r              1/1     Running   0          39m
kube-system   etcd-hecs-356640                      1/1     Running   0          39m
kube-system   kube-apiserver-hecs-356640            1/1     Running   0          39m
kube-system   kube-controller-manager-hecs-356640   1/1     Running   0          39m
kube-system   kube-flannel-ds-dbfmb                 1/1     Running   0          2m32s
kube-system   kube-proxy-kw88f                      1/1     Running   0          39m
kube-system   kube-scheduler-hecs-356640            1/1     Running   0          39m
```
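The CIDR comparison described above can be automated instead of eyeballed. The sketch below is a hypothetical helper (the file name and expected CIDR are just this guide's values): it pulls the `"Network"` value out of kube-flannel.yml so you can compare it against the `--pod-network-cidr` you used.

```shell
# flannel_cidr: print the "Network" CIDR embedded in the net-conf.json
# section of a kube-flannel.yml file, e.g. 10.244.0.0/16.
flannel_cidr() {
  grep -o '"Network": *"[^"]*"' "$1" | head -n1 | sed 's/.*"\([^"]*\)"$/\1/'
}
```

Usage: `[ "$(flannel_cidr kube-flannel.yml)" = "10.244.0.0/16" ] || echo "edit the yml to match --pod-network-cidr"`.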
### A problem seen on the node

Cause: the kubernetes master was not bound to this machine at cluster initialization time, so every kubectl command fails the same way. Setting an environment variable on this machine solves the problem:

```
[root@node1 ~]# kubectl get pod --all-namespaces
The connection to the server localhost:8080 was refused - did you specify the right host or port?
[root@node1 ~]# kubectl get pod,svc -o wide
The connection to the server localhost:8080 was refused - did you specify the right host or port?
```

On the master node, copy the /etc/kubernetes/admin.conf file to the same directory on the worker. I copied it directly between hosts; if you don't mind the hassle, downloading and re-uploading works too:

```
[root@master kubernetes]# scp -r admin.conf 192.168.127.135:/etc/kubernetes/
The authenticity of host '192.168.127.135 (192.168.127.135)' can't be established.
ECDSA key fingerprint is SHA256:4DUCS5xcZjZvBxAyFTcstQbqEKyFsGQwkCvuoNoMLZk.
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
Warning: Permanently added '192.168.127.135' (ECDSA) to the list of known hosts.
root@192.168.127.135's password:
admin.conf
```

On the worker node, set the environment variable. How depends on your setup; here are two ways on Linux.

Option 1: edit the file and append a new environment variable at the bottom:

```
vim /etc/profile
export KUBECONFIG=/etc/kubernetes/admin.conf
```

Option 2: append directly:

```
echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> /etc/profile
source /etc/profile
```

Then on the worker node:

```
[root@node1 ~]# kubectl get pod,svc -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP           NODE    NOMINATED NODE   READINESS GATES
pod/nginx-5c7588df-v8slr   1/1     Running   0          23h   10.244.1.3   node1   <none>           <none>

NAME                 TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)        AGE   SELECTOR
service/kubernetes   ClusterIP   10.1.0.1      <none>        443/TCP        28h   <none>
service/nginx        NodePort    10.1.16.121   <none>        80:32509/TCP   23h   app=nginx
```
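kubectl falls back to localhost:8080 when it cannot find a kubeconfig, which is exactly the "connection refused" error above. The guard below is a hypothetical sketch: it checks that KUBECONFIG (or the default ~/.kube/config) points at a readable file before you bother running kubectl on a worker.

```shell
# kubeconfig_ok: succeed if $KUBECONFIG, or failing that ~/.kube/config,
# is a readable file; otherwise kubectl will try localhost:8080 and fail.
kubeconfig_ok() {
  local cfg="${KUBECONFIG:-$HOME/.kube/config}"
  [ -r "$cfg" ]
}
```

Usage: `kubeconfig_ok || echo "copy admin.conf from the master and export KUBECONFIG first"`.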