0. Introduction
Starting with version 1.24, Kubernetes no longer supports Docker directly. You can still adjust the configuration so that post-1.24 Kubernetes calls Docker, but Docker itself calls containerd under the hood; rather than having Kubernetes go through Docker to reach containerd, it makes more sense to have Kubernetes call containerd directly and avoid the extra overhead.
Besides containerd, another popular container runtime is Podman. Podman's official installation docs either install online through a package manager or have the package manager pull a pile of dependencies before building from source, which can be awkward in an offline intranet environment. The containerd release, by contrast, is a set of static binaries that work right after extraction, which makes it more convenient offline.
This article deploys a three-node cluster plus a Harbor image registry from binaries in an offline intranet environment. The cluster runs three apiservers behind an nginx reverse proxy to improve master availability (if you need higher availability, add keepalived on top).
Software versions:
| Name | Version | Notes |
| --- | --- | --- |
| containerd | cri-containerd-cni-1.7.2-linux-amd64 | container runtime |
| harbor | 2.8.2 | container image registry |
| etcd | 3.4.24 | key-value store |
| kubernetes | 1.26.6 | container orchestration system |
| nginx | 1.25.1 | load balancer, reverse proxy for the apiservers |
Server information:
| IP | OS | Hardware | Hostname | Role |
| --- | --- | --- | --- | --- |
| 192.168.3.31 | Debian 11.6 amd64 | 4C4G | k8s31 | nginx + etcd + master + node |
| 192.168.3.32 | Debian 11.6 amd64 | 4C4G | k8s32 | etcd + master + node |
| 192.168.3.33 | Debian 11.6 amd64 | 4C4G | k8s33 | etcd + master + node |
| 192.168.3.43 | Debian 11.6 amd64 | 4C4G | (none) | harbor, intranet domain registry.atlas.cn |
1. System initialization
The initialization steps must be run on all three Kubernetes node hosts; adjust the parameters to your environment.
- Set the hostname
# on 192.168.3.31
hostnamectl set-hostname k8s31
# on 192.168.3.32
hostnamectl set-hostname k8s32
# on 192.168.3.33
hostnamectl set-hostname k8s33
- Edit /etc/hosts and add the following entries.
192.168.3.31 k8s31
192.168.3.32 k8s32
192.168.3.33 k8s33
- Configure time synchronization
# 1. install chrony
apt install -y chrony
# 2. add the intranet NTP server. On the public internet you could use Alibaba Cloud's NTP server: ntp.aliyun.com
echo 'server 192.168.3.41 iburst' > /etc/chrony/sources.d/custom-ntp-server.sources
# 3. start the service
systemctl start chrony
# if chrony is already running, hot-reload the sources instead:
chronyc reload sources
- Disable swap. If a swap partition was created when the system was installed, it needs to be turned off (see the note after the command for making this permanent).
swapoff -a
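swapoff -a only takes effect until the next reboot. If you want swap to stay off permanently, you can also comment out the swap entry in /etc/fstab, for example:

# comment out any non-commented line whose field list contains "swap"
sed -ri 's/^([^#].*\sswap\s)/#\1/' /etc/fstab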
- Load kernel modules
# add the configuration
cat << EOF > /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
# load the modules immediately
modprobe overlay
modprobe br_netfilter
# verify; no output means the module is not loaded
lsmod | grep br_netfilter
- Configure kernel parameters
# 1. add the configuration file
cat << EOF > /etc/sysctl.d/k8s-sysctl.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
user.max_user_namespaces=28633
vm.swappiness = 0
EOF
# 2. apply the settings
sysctl -p /etc/sysctl.d/k8s-sysctl.conf
- Enable IPVS. Write a systemd unit so the modules are loaded into the kernel automatically at boot; a quick verification follows the snippet below.
# 1. install dependencies
apt install -y ipset ipvsadm
# 2. create the script; the path can be anything
tee /root/scripts/k8s.sh <<EOF
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
EOF
# 3. create a systemd unit that runs the script at boot
cat << EOF > /etc/systemd/system/myscripts.service
[Unit]
Description=Run a Custom Script at Startup
After=default.target

[Service]
ExecStart=sh /root/scripts/k8s.sh

[Install]
WantedBy=default.target
EOF
# 4. reload systemd and enable the unit
systemctl daemon-reload
systemctl enable myscripts
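To check that the unit actually loads the modules, you can start it once and look for them (assuming the script path used above):

systemctl start myscripts
lsmod | grep -e ip_vs -e nf_conntrack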
2. Deploy the Harbor image registry
Since deploying the cluster requires pulling a few images first, Harbor is deployed separately on a server outside the cluster. The official installation script uses Docker, so the Harbor node needs Docker installed first; for the steps see 博客园 - linux离线安装docker与compose.
For the Harbor installation steps see 博客园 - centos离线安装harbor; a brief outline follows.
Harbor's GitHub releases page (GitHub - goharbor/harbor/releases) provides an offline installer package; download it in advance and extract it.
- Create an SSL certificate
mkdir certs
# create the server key harbor.key
openssl genrsa -des3 -out harbor.key 2048
# enter and confirm a passphrase of your choice; remember it, it is needed below
# create the certificate signing request harbor.csr
openssl req -new -key harbor.key -out harbor.csr
# enter the key passphrase, then press Enter through the remaining prompts
# back up the server key
cp harbor.key harbor.key.org
# strip the passphrase from the key
openssl rsa -in harbor.key.org -out harbor.key
# enter the key passphrase
# create a certificate harbor.crt valid for ten years from today
openssl x509 -req -days 3650 -in harbor.csr -signkey harbor.key -out harbor.crt
- Edit the configuration file harbor.yml; only the items that were changed are listed. The data and log directories are customized.
hostname: 192.168.3.43
certificate: /home/atlas/apps/harbor/certs/harbor.crt
private_key: /home/atlas/apps/harbor/certs/harbor.key
# admin login password
harbor_admin_password: Harbor2023
# data volume directory
data_volume: /home/atlas/apps/harbor/data
# log directory
location: /home/atlas/apps/harbor/logs/
- Run the installation script
./install.sh
- Open https://192.168.3.43 in a browser and verify that you can log in to Harbor.
3. Install containerd
- Download the binary package from GitHub: https://github.com/containerd/containerd/releases
- Extract the archive
tar xf cri-containerd-cni-1.7.2-linux-amd64.tar.gz -C /
- Generate the containerd configuration file
mkdir /etc/containerd
containerd config default > /etc/containerd/config.toml
- Edit /etc/containerd/config.toml and change the following:
# change the data root directory
root = "/home/apps/containerd"

# for Linux distributions that use systemd as the init system, the official
# recommendation is to use systemd as the container cgroup driver; change false to true
SystemdCgroup = true

# change the pause image address; an intranet domain is used here
sandbox_image = "registry.atlas.cn/public/pause:3.9"

# connection settings for the private Harbor registry
[plugins."io.containerd.grpc.v1.cri".registry.configs."registry.atlas.cn"]
  [plugins."io.containerd.grpc.v1.cri".registry.configs."registry.atlas.cn".tls]
    insecure_skip_verify = true
  [plugins."io.containerd.grpc.v1.cri".registry.configs."registry.atlas.cn".auth]
    username = "admin"
    password = "Harbor2023"
- Reload the systemd configuration and start containerd. A quick sanity check with crictl follows.
systemctl daemon-reload
systemctl start containerd
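To confirm that containerd is up and can reach the Harbor registry, a quick check with crictl can be used (crictl is shipped in the cri-containerd-cni bundle; install it separately if not present, and note that the socket path below assumes the default containerd configuration):

# point crictl at containerd's CRI socket
cat << EOF > /etc/crictl.yaml
runtime-endpoint: unix:///run/containerd/containerd.sock
image-endpoint: unix:///run/containerd/containerd.sock
EOF
# show runtime status
crictl info
# test pulling the pause image through the registry settings configured above
crictl pull registry.atlas.cn/public/pause:3.9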
4. Generate a CA certificate
Both the Kubernetes and etcd clusters below use this CA certificate. If your organization provides a central CA, simply use the certificates it issues; if not, a self-signed CA certificate can be used for the security configuration. Here we generate our own CA.
# generate the private key ca.key
openssl genrsa -out ca.key 2048
# generate the root certificate ca.crt from the private key
# /CN is the master's hostname or IP address
# days is the certificate's validity period
openssl req -x509 -new -nodes -key ca.key -subj "/CN=192.168.3.31" -days 36500 -out ca.crt
# copy the CA certificate to /etc/kubernetes/pki
mkdir -p /etc/kubernetes/pki
cp ca.crt ca.key /etc/kubernetes/pki/
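If you want to double-check the result, the subject and validity period of the generated CA can be inspected with:

openssl x509 -in ca.crt -noout -subject -dates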
5. Deploy the etcd cluster
Deploy a three-node etcd cluster whose members communicate over HTTPS. The etcd package can be downloaded from the official site; extract it and put the etcd and etcdctl binaries from the archive into /usr/local/bin.
- Create the file etcd_ssl.cnf. The IP addresses are the etcd nodes.
[ req ]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[ req_distinguished_name ]

[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names

[ alt_names ]
IP.1 = 192.168.3.31
IP.2 = 192.168.3.32
IP.3 = 192.168.3.33
- Create the etcd server certificate
openssl genrsa -out etcd_server.key 2048
openssl req -new -key etcd_server.key -config etcd_ssl.cnf -subj "/CN=etcd-server" -out etcd_server.csr
openssl x509 -req -in etcd_server.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 3650 -extensions v3_req -extfile etcd_ssl.cnf -out etcd_server.crt
- Create the etcd client certificate
openssl genrsa -out etcd_client.key 2048
openssl req -new -key etcd_client.key -config etcd_ssl.cnf -subj "/CN=etcd-client" -out etcd_client.csr
openssl x509 -req -in etcd_client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 3650 -extensions v3_req -extfile etcd_ssl.cnf -out etcd_client.crt
- Edit the etcd configuration file. Note that ETCD_NAME and the listen addresses differ per node; adjust the IPs and certificate paths to your environment. The example below is the etcd configuration for 192.168.3.31.
ETCD_NAME=etcd1
ETCD_DATA_DIR=/home/atlas/apps/etcd/data

ETCD_CERT_FILE=/home/atlas/apps/etcd/certs/etcd_server.crt
ETCD_KEY_FILE=/home/atlas/apps/etcd/certs/etcd_server.key
ETCD_TRUSTED_CA_FILE=/home/atlas/apps/kubernetes/certs/ca.crt
ETCD_CLIENT_CERT_AUTH=true
ETCD_LISTEN_CLIENT_URLS=https://192.168.3.31:2379
ETCD_ADVERTISE_CLIENT_URLS=https://192.168.3.31:2379
ETCD_PEER_CERT_FILE=/home/atlas/apps/etcd/certs/etcd_server.crt
ETCD_PEER_KEY_FILE=/home/atlas/apps/etcd/certs/etcd_server.key
ETCD_PEER_TRUSTED_CA_FILE=/home/atlas/apps/kubernetes/certs/ca.crt
ETCD_LISTEN_PEER_URLS=https://192.168.3.31:2380
ETCD_INITIAL_ADVERTISE_PEER_URLS=https://192.168.3.31:2380

ETCD_INITIAL_CLUSTER_TOKEN=etcd-cluster
ETCD_INITIAL_CLUSTER="etcd1=https://192.168.3.31:2380,etcd2=https://192.168.3.32:2380,etcd3=https://192.168.3.33:2380"
ETCD_INITIAL_CLUSTER_STATE=new
- Edit /etc/systemd/system/etcd.service; adjust the configuration file path and the etcd binary path to your environment.
[Unit]
Description=etcd key-value store
Documentation=https://github.com/etcd-io/etcd
After=network.target

[Service]
EnvironmentFile=/home/atlas/apps/etcd/conf/etcd.conf
ExecStart=/usr/local/bin/etcd
Restart=always

[Install]
WantedBy=multi-user.target
- Reload systemd and start etcd
systemctl daemon-reload
systemctl start etcd
- Verify that the cluster was deployed successfully. Adjust the certificate paths and the etcd endpoints (IPs and ports) as needed.
etcdctl --cacert=/etc/kubernetes/pki/ca.crt --cert=/home/atlas/apps/etcd/certs/etcd_client.crt --key=/home/atlas/apps/etcd/certs/etcd_client.key --endpoints=https://192.168.3.31:2379,https://192.168.3.32:2379,https://192.168.3.33:2379 endpoint health
If the cluster was deployed successfully, the output should look similar to:
https://192.168.3.33:2379 is healthy: successfully committed proposal: took = 27.841376ms
https://192.168.3.32:2379 is healthy: successfully committed proposal: took = 29.489289ms
https://192.168.3.31:2379 is healthy: successfully committed proposal: took = 35.703538ms
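If an endpoint is unhealthy, other etcdctl subcommands (same TLS flags as above) give more detail, for example:

etcdctl --cacert=/etc/kubernetes/pki/ca.crt \
  --cert=/home/atlas/apps/etcd/certs/etcd_client.crt \
  --key=/home/atlas/apps/etcd/certs/etcd_client.key \
  --endpoints=https://192.168.3.31:2379,https://192.168.3.32:2379,https://192.168.3.33:2379 \
  member list -w table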
6. Deploy Kubernetes
The Kubernetes binary packages can be downloaded from GitHub: https://github.com/kubernetes/kubernetes/releases
Find the binary download links in the changelog and download the server binaries; the package contains the binaries for both the master and the node components.
After extracting, move the binaries into /usr/local/bin.
6.1 Install kube-apiserver
- Edit master_ssl.cnf. DNS.5 through DNS.7 are the hostnames of the three servers, which must also be configured in /etc/hosts. IP.1 is the Cluster IP of the Master Service, and IP.2 through IP.4 are the apiserver hosts' IPs.
[req]
req_extensions = v3_req
distinguished_name = req_distinguished_name

[req_distinguished_name]

[ v3_req ]
basicConstraints = CA:FALSE
keyUsage = nonRepudiation, digitalSignature, keyEncipherment
subjectAltName = @alt_names

[alt_names]
DNS.1 = kubernetes
DNS.2 = kubernetes.default
DNS.3 = kubernetes.default.svc
DNS.4 = kubernetes.default.svc.cluster.local
DNS.5 = k8s31
DNS.6 = k8s32
DNS.7 = k8s33
IP.1 = 169.169.0.1
IP.2 = 192.168.3.31
IP.3 = 192.168.3.32
IP.4 = 192.168.3.33
- Generate the certificate files
openssl genrsa -out apiserver.key 2048
openssl req -new -key apiserver.key -config master_ssl.cnf -subj "/CN=192.168.3.31" -out apiserver.csr
# ca.crt and ca.key are the files created in "4. Generate a CA certificate"
openssl x509 -req -in apiserver.csr -CA ca.crt -CAkey ca.key -CAcreateserial -days 36500 -extensions v3_req -extfile master_ssl.cnf -out apiserver.crt
- Use cfssl to create sa.pub and sa-key.pem. cfssl and cfssljson can be downloaded from GitHub.
cat <<EOF > sa-csr.json
{
    "CN":"sa",
    "key":{
        "algo":"rsa",
        "size":2048
    },
    "names":[
        {
            "C":"CN",
            "L":"BeiJing",
            "ST":"BeiJing",
            "O":"k8s",
            "OU":"System"
        }
    ]
}
EOF
# cfssl and cfssljson can be found on GitHub
cfssl gencert -initca sa-csr.json | cfssljson -bare sa -
openssl x509 -in sa.pem -pubkey -noout > sa.pub
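If you would rather not introduce cfssl just for this one step, the same kind of key pair can also be produced with openssl alone; this is an equivalent alternative, not part of the original procedure:

# generate the service account signing key and export its public key
openssl genrsa -out sa-key.pem 2048
openssl rsa -in sa-key.pem -pubout -out sa.pub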
- Edit the kube-apiserver configuration file; adjust the file paths and etcd addresses to your environment.
KUBE_API_ARGS="--secure-port=6443 \
--tls-cert-file=/home/atlas/apps/kubernetes/apiserver/certs/apiserver.crt \
--tls-private-key-file=/home/atlas/apps/kubernetes/apiserver/certs/apiserver.key \
--client-ca-file=/home/atlas/apps/kubernetes/certs/ca.crt \
--service-account-issuer=https://kubernetes.default.svc.cluster.local \
--service-account-key-file=/home/atlas/apps/kubernetes/certs/sa.pub \
--service-account-signing-key-file=/home/atlas/apps/kubernetes/certs/sa-key.pem \
--apiserver-count=3 --endpoint-reconciler-type=master-count \
--etcd-servers=https://192.168.3.31:2379,https://192.168.3.32:2379,https://192.168.3.33:2379 \
--etcd-cafile=/home/atlas/apps/kubernetes/certs/ca.crt \
--etcd-certfile=/home/atlas/apps/etcd/certs/etcd_client.crt \
--etcd-keyfile=/home/atlas/apps/etcd/certs/etcd_client.key \
--service-cluster-ip-range=169.169.0.0/16 \
--service-node-port-range=30000-32767 \
--allow-privileged=true \
--audit-log-maxsize=100 \
--audit-log-maxage=15 \
--audit-log-path=/home/atlas/apps/kubernetes/apiserver/logs/apiserver.log --v=2"
- Edit the service file /etc/systemd/system/kube-apiserver.service
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/home/atlas/apps/kubernetes/apiserver/conf/apiserver
ExecStart=/usr/local/bin/kube-apiserver $KUBE_API_ARGS
Restart=always

[Install]
WantedBy=multi-user.target
- Reload the service file and start kube-apiserver
systemctl daemon-reload
systemctl start kube-apiserver
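To confirm the apiserver is actually serving, a simple check is to hit its /healthz endpoint; -k is needed because the certificate is signed by our self-signed CA. With the arguments above (no explicit --authorization-mode), this should return ok; if anonymous access has been restricted it may return 401/403 instead, which still shows the listener is up:

curl -k https://192.168.3.31:6443/healthz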
- Generate the client certificate
openssl genrsa -out client.key 2048
# the /CN value identifies the client user name used when connecting to the apiserver
openssl req -new -key client.key -subj "/CN=admin" -out client.csr
openssl x509 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt -days 36500
- Create the kubeconfig file that clients use to connect to the apiserver. The server address is the nginx listen address; adjust the configuration to your environment.
apiVersion: v1
kind: Config
clusters:
- name: default
  cluster:
    server: https://192.168.3.31:9443
    certificate-authority: /home/atlas/apps/kubernetes/certs/ca.crt
users:
- name: admin
  user:
    client-certificate: /home/atlas/apps/kubernetes/apiserver/certs/client.crt
    client-key: /home/atlas/apps/kubernetes/apiserver/certs/client.key
contexts:
- context:
    cluster: default
    user: admin
  name: default
current-context: default
6.2 Install kube-controller-manager
- Edit the configuration file /home/atlas/apps/kubernetes/controller-manager/conf/env
KUBE_CONTROLLER_MANAGER_ARGS="--kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig \
--leader-elect=true \
--service-cluster-ip-range=169.169.0.0/16 \
--service-account-private-key-file=/home/atlas/apps/kubernetes/apiserver/certs/apiserver.key \
--root-ca-file=/home/atlas/apps/kubernetes/certs/ca.crt \
--v=0"
- Edit the service file /etc/systemd/system/kube-controller-manager.service
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/home/atlas/apps/kubernetes/controller-manager/conf/env
ExecStart=/usr/local/bin/kube-controller-manager $KUBE_CONTROLLER_MANAGER_ARGS
Restart=always

[Install]
WantedBy=multi-user.target
- Reload the configuration and start the service
systemctl daemon-reload
systemctl start kube-controller-manager
6.3 Install kube-scheduler
- Edit the configuration file /home/atlas/apps/kubernetes/scheduler/conf/env
KUBE_SCHEDULER_ARGS="--kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig \
--leader-elect=true \
--v=0"
- Edit the service file /etc/systemd/system/kube-scheduler.service
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
EnvironmentFile=/home/atlas/apps/kubernetes/scheduler/conf/env
ExecStart=/usr/local/bin/kube-scheduler $KUBE_SCHEDULER_ARGS
Restart=always

[Install]
WantedBy=multi-user.target
- Start the service
systemctl daemon-reload
systemctl start kube-scheduler
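With the apiserver, controller-manager, and scheduler all running, they can be sanity-checked with kubectl and the kubeconfig created earlier. That kubeconfig points at the nginx address, which is not deployed yet, so override the server to the local apiserver for now (componentstatuses is deprecated but still works in 1.26):

kubectl --kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig \
  --server=https://192.168.3.31:6443 get componentstatuses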
6.4 Install nginx
Here nginx is used as a TCP reverse proxy in front of the apiservers; haproxy would also work. For building nginx from source see 博客园 - linux编译安装nginx; running nginx in Docker is even simpler and is not covered here. An example configuration:
worker_processes auto;

#error_log  logs/error.log;
#error_log  logs/error.log  notice;
#error_log  logs/error.log  info;
#pid        logs/nginx.pid;

events {
    worker_connections  65536;
}

stream {
    log_format json2 '$remote_addr [$time_local] '
        '$protocol $status $bytes_sent $bytes_received '
        '$session_time "$upstream_addr" '
        '"$upstream_bytes_sent" "$upstream_bytes_received" "$upstream_connect_time"';
    access_log logs/stream.log json2;

    upstream apiservers {
        server 192.168.3.31:6443;
        server 192.168.3.32:6443;
        server 192.168.3.33:6443;
    }

    server {
        listen 9443;
        proxy_pass apiservers;
    }
}
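After starting nginx, the proxy path can be checked the same way as the apiserver itself; an HTTP response over port 9443 shows that traffic is being forwarded to an apiserver (nginx only passes the TCP stream through, so TLS is still terminated by the apiserver):

curl -k https://192.168.3.31:9443/healthz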
6.5 Install kubelet
- Edit the file /home/atlas/apps/kubernetes/kubelet/conf/env. Set the IP in hostname-override to the node's own IP. If you changed containerd's socket path, update it here as well.
KUBELET_ARGS="--kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig \
--config=/home/atlas/apps/kubernetes/kubelet/conf/kubelet.config \
--hostname-override=192.168.3.31 \
--v=0 \
--container-runtime-endpoint=unix:///run/containerd/containerd.sock"
The main parameters:
| Parameter | Description |
| --- | --- |
| --kubeconfig | Configuration for connecting to the apiserver; can be the same kubeconfig used by controller-manager. On a new node, remember to copy the client certificate files, such as ca.crt, client.key, and client.crt. |
| --config | The kubelet configuration file, holding parameters that can be shared by multiple nodes. |
| --hostname-override | This node's name in the cluster; defaults to the hostname. |
| --network-plugin | Network plugin type; the CNI plugin is recommended (this flag was removed from the kubelet in 1.24+ and is not used in the configuration above). |
- Edit the file /home/atlas/apps/kubernetes/kubelet/conf/kubelet.config
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
port: 10250
cgroupDriver: systemd
clusterDNS: ["169.169.0.100"]
clusterDomain: cluster.local
authentication:
  anonymous:
    enabled: true
The main parameters:
| Parameter | Description |
| --- | --- |
| address | IP address the service listens on |
| port | Port the service listens on; defaults to 10250 |
| cgroupDriver | cgroup driver; defaults to cgroupfs, systemd is also available |
| clusterDNS | IP address of the cluster DNS service |
| clusterDomain | DNS domain suffix for services |
| authentication | Whether to allow anonymous access and/or use webhook authentication |
- Edit the service file /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet Server
Documentation=https://github.com/kubernetes/kubernetes
After=containerd.service

[Service]
EnvironmentFile=/home/atlas/apps/kubernetes/kubelet/conf/env
ExecStart=/usr/local/bin/kubelet $KUBELET_ARGS
Restart=always

[Install]
WantedBy=multi-user.target
- Reload the service file and start kubelet
systemctl daemon-reload && systemctl start kubelet
6.6 Install kube-proxy
- Edit the file /home/atlas/apps/kubernetes/proxy/conf/env. Set the IP in hostname-override to the node's own IP.
KUBE_PROXY_ARGS="--kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig \
--hostname-override=192.168.3.31 \
--proxy-mode=ipvs \
--v=0"
- Edit the service file /etc/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target

[Service]
EnvironmentFile=/home/atlas/apps/kubernetes/proxy/conf/env
ExecStart=/usr/local/bin/kube-proxy $KUBE_PROXY_ARGS
Restart=always

[Install]
WantedBy=multi-user.target
- Reload the service file and start it
systemctl daemon-reload && systemctl start kube-proxy
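Because kube-proxy runs in ipvs mode here, you can confirm it is writing rules by listing the IPVS table; once the node has synced, the kubernetes Service VIP 169.169.0.1:443 should show up with the three apiservers as real servers:

ipvsadm -Ln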
6.7 Install calico
- On a master node, use kubectl to list the nodes that have automatically registered with Kubernetes. Since the master has HTTPS authentication enabled, kubectl also needs the client CA certificates to connect; the apiserver kubeconfig file can be used directly.
kubectl --kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig get nodes
If you don't want to specify the kubeconfig file on every command, edit ~/.bashrc, add the following line, and then run source ~/.bashrc.
alias kubectl='/usr/local/bin/kubectl --kubeconfig=/home/atlas/apps/kubernetes/apiserver/conf/kubeconfig'
If you followed the steps above, the command should produce output similar to:
NAME           STATUS   ROLES    AGE   VERSION
192.168.3.31   Ready    <none>   18m   v1.26.6
192.168.3.32   Ready    <none>   16m   v1.26.6
192.168.3.33   Ready    <none>   16m   v1.26.6
The containerd package already ships CNI plugins, so the nodes show up as Ready. In my own tests, however, pod-to-pod communication between nodes still had problems, so I switched to Calico, which I am more familiar with.
- Download the Calico manifest.
wget https://docs.projectcalico.org/manifests/calico.yaml
- Edit calico.yaml. Because this is an offline intranet deployment, the Calico images were downloaded on the public internet in advance and pushed to the intranet Harbor registry, so the image references in the manifest are changed to the intranet addresses.
image: registry.atlas.cn/calico/cni:v3.26.1
image: registry.atlas.cn/calico/node:v3.26.1
image: registry.atlas.cn/calico/kube-controllers:v3.26.1
- Deploy Calico. PS: the kubectl commands from here on use the alias with the kubeconfig configured above.
kubectl apply -f calico.yaml
- Check whether the Calico pods are running properly. If everything is fine, they should all be in the Running state; if not, describe the pod to see what went wrong.
kubectl get pods -A
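If a pod is not Running, a typical troubleshooting pass looks like this (the pod name below is a placeholder; use the one printed by the previous command):

# show events of the problematic pod
kubectl -n kube-system describe pod calico-node-xxxxx
# check its container logs
kubectl -n kube-system logs calico-node-xxxxx -c calico-node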
6.8 Install CoreDNS in the cluster
- Edit the deployment manifest coredns.yaml. The forward address in the Corefile is the intranet DNS server; if there is no intranet DNS, it can be changed to /etc/resolv.conf. The CoreDNS image is also one that was pushed to Harbor in advance.
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  Corefile: |
    cluster.local {
        errors
        health {
            lameduck 5s
        }
        ready
        kubernetes cluster.local 169.169.0.0/16 {
            fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . 192.168.3.41
        cache 30
        loop
        reload
        loadbalance
    }
    . {
        cache 30
        loadbalance
        forward . 192.168.3.41
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/name: "CoreDNS"
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
    spec:
      priorityClassName: system-cluster-critical
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
      nodeSelector:
        kubernetes.io/os: linux
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                  - key: k8s-app
                    operator: In
                    values: ["kube-dns"]
              topologyKey: kubernetes.io/hostname
      imagePullSecrets:
        - name: registry-harbor
      containers:
      - name: coredns
        image: registry.atlas.cn/public/coredns:1.11.1
        imagePullPolicy: IfNotPresent
        resources:
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        args: [ "-conf", "/etc/coredns/Corefile" ]
        volumeMounts:
        - name: config-volume
          mountPath: /etc/coredns
          readOnly: true
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        - containerPort: 9153
          name: metrics
          protocol: TCP
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
            drop:
            - all
          readOnlyRootFilesystem: true
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /ready
            port: 8181
            scheme: HTTP
      dnsPolicy: Default
      volumes:
        - name: config-volume
          configMap:
            name: coredns
            items:
            - key: Corefile
              path: Corefile
---
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  annotations:
    prometheus.io/port: "9153"
    prometheus.io/scrape: "true"
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "CoreDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 169.169.0.100
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
  - name: metrics
    port: 9153
    protocol: TCP
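Apply the manifest and check that the CoreDNS pod starts and the kube-dns Service gets the expected Cluster IP (169.169.0.100, matching the clusterDNS setting in the kubelet configuration):

kubectl apply -f coredns.yaml
kubectl -n kube-system get pods -l k8s-app=kube-dns
kubectl -n kube-system get svc kube-dns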
- Deploy an nginx workload for testing. Adjust the image address to your environment.
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
      env: dev
  template:
    metadata:
      labels:
        app: nginx
        env: dev
    spec:
      containers:
      - name: nginx
        image: registry.atlas.cn/public/nginx:1.25.1
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: svc-nginx
spec:
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
  selector:
    app: nginx
    env: dev
Deploy the nginx service
kubectl create -f nginx.yaml
- Run an Ubuntu pod. The image is based on the stock ubuntu:22.04, with dnsutils installed in advance, then repackaged and pushed to the intranet Harbor. The Dockerfile:
FROM ubuntu:22.04
RUN apt update -y && apt install -y dnsutils iputils-ping curl
RUN apt clean && rm -rf /var/lib/apt/lists/*
Declare the pod with a manifest. Adjust the image address to your environment.
apiVersion: v1
kind: Pod
metadata:
  name: ubuntu
  namespace: default
spec:
  containers:
  - name: ubuntu
    image: registry.atlas.cn/public/ubuntu:22.04.1
    command:
    - tail
    - -f
    - /dev/null
Create the pod
kubectl create -f ubuntu.yaml
Use the exec option to get a shell inside the pod
kubectl exec -it ubuntu -- bash
Inside the pod, test connectivity to nginx. If everything responds normally, the cluster is essentially up and working.
# test whether svc-nginx resolves to an IP
nslookup svc-nginx
# test whether svc-nginx:80 is reachable
curl http://svc-nginx
Additional notes
The steps above only set up a basic Kubernetes cluster that can run services. In production you also need to consider storage, networking, security, and so on, which is too much to cover here; refer to other documentation.
Troubleshooting
sysctl reports errors when loading the configuration:
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-ip6tables: No such file or directory
sysctl: cannot stat /proc/sys/net/bridge/bridge-nf-call-iptables: No such file or directory
Fix: load the kernel module
modprobe br_netfilter