前言
在本章中,我们将直接将 Cilium 安装到 Kubernetes 集群中。
在实验中,我们用到的组件及版本为:
- Cilium 1.13.4
- K3s v1.26.6+k3s1
- OS
- Debian 10, Kernel 4.19.232, arm64
- Ubuntu 23.04, Kernel 6.2, x86
📝Notes:
如前文所述,Cilium 对 Linux Kernel 版本要求很高。1.13.4 推荐 Kernel ≥ 5.10(使用最新 LTS 稳定版 Kernel), 最低 Linux Kernel 得是 4.19.57.
所以我们选择 2 个 OS, 一个只满足最低 Kernel 要求,一个是尽可能新的 Kernel, 看看安装和功能有哪些差别。
Cilium 安装方式
Cilium 支持 2 种安装方式:
- Cilium CLI
- Helm chart
CLI 工具能让你轻松上手 Cilium,尤其是在刚开始学习时。它直接使用 Kubernetes API 来检查与现有 kubectl 上下文相对应的集群,并为检测到的 Kubernetes 实施 选择合适的安装选项。
Helm Chart 方法适用于需要对 Cilium 安装进行 精细控制的高级安装和生产环境。它要求你为特定的 Kubernetes 环境手动选择最佳数据路径 (datapath) 和 IPAM 模式。
系统需求
要安装 Cilium, 最低系统需求如下:
- 主机是 AMD64 或 AArch64(即 arm64) 架构
- 需要开通某些防火墙规则,详见:https://docs.cilium.io/en/stable/operations/system_requirements/#firewall-rules
- Linux Kernel >= 4.19.57(对于 RHEL8, Linux Kernel >= 4.18)
- Linux Kernel 需要启用相关配置,具体可以参考这篇文章:玩转 PI 系列 - 如何在 Rockchip arm 开发板上安装 docker tailscale k3s cilium?
- 如果需要用到 Cilium 的高级功能,则需要更高版本的内核,具体如下:
Cilium 功能 | 最小 Kernel 版本 |
Bandwidth Manager | >= 5.1 |
Egress Gateway | >= 5.2 |
VXLAN Tunnel Endpoint (VTEP) Integration | >= 5.2 |
WireGuard Transparent Encryption | >= 5.6 |
Full support for Session Affinity | >= 5.7 |
BPF-based proxy redirection | >= 5.7 |
Socket-level LB bypass in pod netns | >= 5.7 |
L3 devices | >= 5.8 |
BPF-based host routing | >= 5.10 |
IPv6 BIG TCP support | >= 5.19 |
Cilium 安装
首次安装,我们使用 Cilium CLI 方式安装。OS 为:Debian 10, Kernel 4.19 4.19.232, arm64
安装 K3s
我们通过 K3s 安装 Kubernetes 集群。具体命令如下:
# Server Node curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | INSTALL_K3S_MIRROR=cn INSTALL_K3S_EXEC='--write-kubeconfig-mode=644 --flannel-backend=none --disable-network-policy --prefer-bundled-bin' INSTALL_K3S_VERSION=v1.26.6+k3s1 sh - BASH |
📝Notes:
几个主流 Linux 发行版发布的 iptables 版本包含一个错误,该错误会导致重复规则的累积,从而对节点的性能和稳定性产生负面影响。有关如何确定你是否受此问题影响,请参阅 issue #3117。
K3s 具有一个可以正常运行的 iptables (v1.8.8) 版本。你可以通过使用
--prefer-bundled-bin
选项来启动 K3s,或从操作系统中卸载 iptables/nftables 包,从而让 K3s 使用捆绑的 iptables 版本。版本
--prefer-bundled-bin
标志从 2022-12 版本开始可用(v1.26.0+k3s1、v1.25.5+k3s1、v1.24.9+k3s1、v1.23.15+k3s1)。
验证:
$ systemctl status k3s ● k3s.service - Lightweight Kubernetes Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled) Active: active (running) BASH |
$ k3s kubectl get node NAME STATUS ROLES AGE VERSION linaro-alip NotReady control-plane,master 3d1h v1.26.6+k3s1 BASH |
🐾注意,由于没有安装 flannel, 也还没开始安装 Cilium, 所以 node 状态应为:NotReady
.
安装 Cilium CLI
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/master/stable.txt) CLI_ARCH=amd64 if ["$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum} sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum} BASH |
验证:
$ cilium version cilium-cli: v0.15.2 compiled with go1.20.4 on linux/arm64 cilium image (default): v1.13.4 cilium image (stable): v1.13.4 cilium image (running): 1.13.4 BASH |
Cilium Install
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml cilium install BASH |
通过该命令,cilium 会自动进行一些环境信息的识别,以及参数的选择和判断:
🔮 Auto-detected Kubernetes kind: k3s ✨ Running "k3s" validation checks ✅ Detected k3s version "v1.26.6+k3s1" ℹ️ Using Cilium version 1.13.4 🔮 Auto-detected cluster name: default 🔮 Auto-detected datapath mode: tunnel 🔮 Auto-detected kube-proxy has been installed ℹ️ helm template --namespace kube-system cilium cilium/cilium --version 1.13.4 --set cluster.id=0,cluster.name=default,encryption.nodeEncryption=false,ipam.mode=kubernetes,kubeProxyReplacement=disabled,operator.replicas=1,serviceAccounts.cilium.name=cilium,serviceAccounts.operator.name=cilium-operator,tunnel=vxlan ℹ️ Storing helm values file in kube-system/cilium-cli-helm-values Secret 🔑 Created CA in secret cilium-ca 🔑 Generating certificates for Hubble... 🚀 Creating Service accounts... 🚀 Creating Cluster roles... 🚀 Creating ConfigMap for Cilium version 1.13.4... 🚀 Creating Agent DaemonSet... 🚀 Creating Operator Deployment... ⌛ Waiting for Cilium to be installed and ready... VIM |
验证:
$ cilium status --wait /¯¯\ /¯¯\__/¯¯\ Cilium: OK \__/¯¯\__/ Operator: OK /¯¯\__/¯¯\ Envoy DaemonSet: disabled (using embedded mode) \__/¯¯\__/ Hubble Relay: disabled \__/ ClusterMesh: disabled DaemonSet cilium Desired: 1, Ready: 1/1, Available: 1/1 Deployment cilium-operator Desired: 1, Ready: 1/1, Available: 1/1 Containers: cilium Running: 1 cilium-operator Running: 1 Cluster Pods: 7/7 managed by Cilium Helm chart version: 1.13.4 Image versions cilium quay.io/cilium/cilium:v1.13.4@sha256:bde8800d61aaad8b8451b10e247ac7bdeb7af187bb698f83d40ad75a38c1ee6b: 1 cilium-operator quay.io/cilium/operator-generic:v1.13.4@sha256:09ab77d324ef4d31f7d341f97ec5a2a4860910076046d57a2d61494d426c6301: 1 BASH |
运行以下命令验证群集是否具有正确的网络连接:
$ cilium connectivity test --request-timeout 30s --connect-timeout 10s ℹ️ Monitor aggregation detected, will skip some flow validation steps ✨ [k8s-cluster] Creating namespace for connectivity check... (...) --------------------------------------------------------------------------------------------------------------------- 📋 Test Report --------------------------------------------------------------------------------------------------------------------- ✅ 69/69 tests successful (0 warnings) BASH |
🐾Warning:
在中国安装时,由于网络环境所限,可能部分测试会失败(如访问 1.1.1.1:443). 具体见下方示例.
属于正常情况。
连接性测试需要至少 两个 worker node 才能在群集中成功部署。连接性测试 pod 不会在以控制面角色运行的节点上调度。如果您没有为群集配置两个 worker node,连接性测试命令可能会在等待测试环境部署完成时停滞。
示例, 在中国运行测试的真实情况:
📋 Test Report ❌ 7/42 tests failed (17/291 actions), 12 tests skipped, 1 scenarios skipped: Test [no-policies]: ❌ no-policies/pod-to-cidr/external-1111-0: cilium-test/client2-5c6c769648-mjbdx (10.0.0.237) -> external-1111 (1.1.1.1:443) ❌ no-policies/pod-to-cidr/external-1111-1: cilium-test/client-c4bfddc44-j8mbz (10.0.0.212) -> external-1111 (1.1.1.1:443) ❌ no-policies/pod-to-world/https-to-one.one.one.one-0: cilium-test/client2-5c6c769648-mjbdx (10.0.0.237) -> one.one.one.one-https (one.one.one.one:443) ❌ no-policies/pod-to-world/https-to-one.one.one.one-index-0: cilium-test/client2-5c6c769648-mjbdx (10.0.0.237) -> one.one.one.one-https-index (one.one.one.one:443) ❌ no-policies/pod-to-world/https-to-one.one.one.one-1: cilium-test/client-c4bfddc44-j8mbz (10.0.0.212) -> one.one.one.one-https (one.one.one.one:443) ❌ no-policies/pod-to-world/https-to-one.one.one.one-index-1: cilium-test/client-c4bfddc44-j8mbz (10.0.0.212) -> one.one.one.one-https-index (one.one.one.one:443) Test [all-ingress-deny]: ❌ all-ingress-deny/pod-to-cidr/external-1111-0: cilium-test/client2-5c6c769648-mjbdx (10.0.0.237) -> external-1111 (1.1.1.1:443) ❌ all-ingress-deny/pod-to-cidr/external-1111-1: cilium-test/client-c4bfddc44-j8mbz (10.0.0.212) -> external-1111 (1.1.1.1:443) Test [all-ingress-deny-knp]: ❌ all-ingress-deny-knp/pod-to-cidr/external-1111-0: cilium-test/client-c4bfddc44-j8mbz (10.0.0.212) -> external-1111 (1.1.1.1:443) ❌ all-ingress-deny-knp/pod-to-cidr/external-1111-1: cilium-test/client2-5c6c769648-mjbdx (10.0.0.237) -> external-1111 (1.1.1.1:443) Test [to-cidr-external]: ❌ to-cidr-external/pod-to-cidr/external-1111-0: cilium-test/client2-5c6c769648-mjbdx (10.0.0.237) -> external-1111 (1.1.1.1:443) ❌ to-cidr-external/pod-to-cidr/external-1111-1: cilium-test/client-c4bfddc44-j8mbz (10.0.0.212) -> external-1111 (1.1.1.1:443) Test [to-cidr-external-knp]: ❌ to-cidr-external-knp/pod-to-cidr/external-1111-0: cilium-test/client2-5c6c769648-mjbdx (10.0.0.237) -> external-1111 (1.1.1.1:443) ❌ to-cidr-external-knp/pod-to-cidr/external-1111-1: cilium-test/client-c4bfddc44-j8mbz (10.0.0.212) -> external-1111 (1.1.1.1:443) Test [client-egress-to-cidr-deny]: ❌ client-egress-to-cidr-deny/pod-to-cidr/external-1111-0: cilium-test/client2-5c6c769648-mjbdx (10.0.0.237) -> external-1111 (1.1.1.1:443) ❌ client-egress-to-cidr-deny/pod-to-cidr/external-1111-1: cilium-test/client-c4bfddc44-j8mbz (10.0.0.212) -> external-1111 (1.1.1.1:443) Test [to-fqdns]: ❌ to-fqdns/pod-to-world/http-to-one.one.one.one-1: cilium-test/client-c4bfddc44-j8mbz (10.0.0.212) -> one.one.one.one-http (one.one.one.one:80) connectivity test failed: 7 tests failed SUBUNIT |
查看 Cilium Install 具体启用了哪些功能:
$ kubectl -n kube-system exec ds/cilium -- cilium status Defaulted container "cilium-agent" out of: cilium-agent, config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init) KVStore: Ok Disabled Kubernetes: Ok 1.26 (v1.26.6+k3s1) [linux/arm64] Kubernetes APIs: ["cilium/v2::CiliumClusterwideNetworkPolicy", "cilium/v2::CiliumEndpoint", "cilium/v2::CiliumNetworkPolicy", "cilium/v2::CiliumNode", "core/v1::Namespace", "core/v1::Node", "core/v1::Pods", "core/v1::Service", "discovery/v1::EndpointSlice", "networking.k8s.io/v1::NetworkPolicy"] KubeProxyReplacement: Disabled Host firewall: Disabled CNI Chaining: none CNI Config file: CNI configuration file management disabled Cilium: Ok 1.13.4 (v1.13.4-4061cdfc) NodeMonitor: Listening for events on 4 CPUs with 64x4096 of shared memory Cilium health daemon: Ok IPAM: IPv4: 9/254 allocated from 10.0.0.0/24, IPv6 BIG TCP: Disabled BandwidthManager: Disabled Host Routing: Legacy Masquerading: IPtables Controller Status: 48/48 healthy Proxy Status: No managed proxy redirect Global Identity Range: min 256, max 65535 Hubble: Ok Current/Max Flows: 4095/4095 (100.00%), Flows/s: 11.68 Metrics: Disabled Encryption: Disabled Cluster health: 1/1 reachable (2023-07-19T12:25:40Z) BASH |
这里有几个点注意一下:
datapath mode: tunnel
: 因为兼容性原因,Cilium 会默认启用 tunnel(基于 vxlan) 的 datapatch 模式,也就是 overlay 网络结构。KubeProxyReplacement: Disabled
Cilium 是没有完全替换掉 kube-proxy 的,后面我们会出文章介绍如何实现替换。IPv6 BIG TCP: Disabled
该功能要求 Linux Kernel >= 5.19, 所以在 Kernel 4.19.232 状态为禁用。BandwidthManager: Disabled
该功能要求 Linux Kernel >= 5.1, 所以目前是禁用的Host Routing: Legacy
Legacy Host Routing 还是会用到 iptables, 性能较弱;但是 BPF-based host routing 需要 Linux Kernel >= 5.10Masquerading: IPtables
IP 伪装有几种方式:基于 eBPF 的,和基于 iptables 的。默认使用基于 iptables, 推荐使用 基于 eBPF 的。Hubble Relay: disabled
默认 Hubble 也是禁用的。
Cilium 的最重要的特点就是其性能,所以只要是可以增强性能的,后续会一一介绍如何启用。
安装 Cilium Hubble
$ cilium hubble enable --ui ✨ Patching ConfigMap cilium-config to enable Hubble... 🚀 Creating ConfigMap for Cilium version 1.13.4... ♻️ Restarted Cilium pods ⌛ Waiting for Cilium to become ready before deploying other Hubble component(s)... 🚀 Creating Peer Service... ✨ Generating certificates... 🔑 Generating certificates for Relay... ✨ Deploying Relay... ✨ Deploying Hubble UI and Hubble UI Backend... ⌛ Waiting for Hubble to be installed... ℹ️ Storing helm values file in kube-system/cilium-cli-helm-values Secret ✅ Hubble was successfully enabled! BASH |
验证:
$ cilium status /¯¯\ /¯¯\__/¯¯\ Cilium: OK \__/¯¯\__/ Operator: OK /¯¯\__/¯¯\ Envoy DaemonSet: disabled (using embedded mode) \__/¯¯\__/ Hubble Relay: OK \__/ ClusterMesh: disabled Deployment hubble-ui Desired: 1, Ready: 1/1, Available: 1/1 DaemonSet cilium Desired: 1, Ready: 1/1, Available: 1/1 Deployment cilium-operator Desired: 1, Ready: 1/1, Available: 1/1 Deployment hubble-relay Desired: 1, Ready: 1/1, Available: 1/1 Containers: hubble-ui Running: 1 cilium Running: 1 cilium-operator Running: 1 hubble-relay Running: 1 Cluster Pods: 9/9 managed by Cilium Helm chart version: 1.13.4 Image versions cilium quay.io/cilium/cilium:v1.13.4@sha256:bde8800d61aaad8b8451b10e247ac7bdeb7af187bb698f83d40ad75a38c1ee6b: 1 cilium-operator quay.io/cilium/operator-generic:v1.13.4@sha256:09ab77d324ef4d31f7d341f97ec5a2a4860910076046d57a2d61494d426c6301: 1 hubble-relay quay.io/cilium/hubble-relay:v1.13.4@sha256:bac057a5130cf75adf5bc363292b1f2642c0c460ac9ff018fcae3daf64873871: 1 hubble-ui quay.io/cilium/hubble-ui:v0.11.0@sha256:bcb369c47cada2d4257d63d3749f7f87c91dde32e010b223597306de95d1ecc8: 1 hubble-ui quay.io/cilium/hubble-ui-backend:v0.11.0@sha256:14c04d11f78da5c363f88592abae8d2ecee3cbe009f443ef11df6ac5f692d839: 1 BASH |
使用 Kubectl 检查集群状态
$ kubectl get nodes NAME STATUS ROLES AGE VERSION linaro-alip Ready control-plane,master 3d1h v1.26.6+k3s1 BASH |
$ kubectl get daemonsets --all-namespaces NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE kube-system cilium 1 1 1 1 1 kubernetes.io/os=linux 28h kube-system svclb-traefik-29b9c193 1 1 1 1 1 <none> 2d23h BASH |
$ kubectl get deployments --all-namespaces NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE kube-system local-path-provisioner 1/1 1 1 3d1h default my-nginx 2/2 2 2 32h kube-system coredns 1/1 1 1 3d1h kube-system traefik 1/1 1 1 2d23h kube-system metrics-server 1/1 1 1 3d1h kube-system cilium-operator 1/1 1 1 28h kube-system hubble-relay 1/1 1 1 5m39s kube-system hubble-ui 1/1 1 1 5m38s BASH |
你应该会发现 cilium daemonset 正在集群的所有节点上运行,而 cilium-operator 部署正在单个节点上运行。
恭喜你!🎉🎉🎉你现在已经安装了 Cilium,为 Kubernetes 集群提供连接。
总结
本文我们主要介绍了 Cilium 的快速安装过程。
要安装 Cilium, 需要满足一些基本需求,其中 Cilium 对 Linux Kernel 版本的要求较高。
通过 cilium install
, 在 Debian 10, Kernel 4.19.232, arm64 机器上,安装了 K3s v1.26.6+k3s1 和 Cilium 1.13.4, 启用了 Hubble, 并进行了验证。🎉🎉🎉