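What is Argo Rollouts
Argo Rollouts is a Kubernetes controller plus a set of CRDs that provide more advanced deployment capabilities than a plain Deployment, including canary releases, blue-green deployments, experimentation, and progressive delivery.
It supports the following features: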
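- Blue-green update strategy
- Canary update strategy
- Fine-grained, weighted traffic shifting
- Automated rollbacks and promotions
- Manual judgement
- Customizable metric queries and analysis of business KPIs
- Ingress controller integration: NGINX, ALB
- Service mesh integration: Istio, Linkerd, SMI
- Metric provider integration: Prometheus, Wavefront, Kayenta, Web, Kubernetes Jobs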
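Argo Rollouts works on much the same principle as a Deployment, but with stronger rollout strategies and traffic control. When spec.template changes, Argo Rollouts performs a rollout according to spec.strategy: it typically creates a new ReplicaSet and gradually scales down the pods of the previous ReplicaSet.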
Installation
Install by following the official documentation at https://argoproj.github.io/argo-rollouts/installation/#kubectl-plugin-installation
(1) Install argo-rollouts in the Kubernetes cluster
    kubectl create namespace argo-rollouts
    kubectl apply -n argo-rollouts -f https://raw.githubusercontent.com/argoproj/argo-rollouts/stable/manifests/install.yaml
(2) Install the argo-rollouts kubectl plugin
    curl -LO https://github.com/argoproj/argo-rollouts/releases/latest/download/kubectl-argo-rollouts-linux-amd64
    chmod +x ./kubectl-argo-rollouts-linux-amd64
    mv ./kubectl-argo-rollouts-linux-amd64 /usr/local/bin/kubectl-argo-rollouts
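As a quick sanity check after both steps, a small helper along the following lines could confirm that the controller pod is up and that the plugin is on the PATH. This is only a sketch: the function name check_argo_rollouts is made up for illustration, and it assumes the controller was installed into the argo-rollouts namespace as above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Argo Rollouts): sanity-check the install.
check_argo_rollouts() {
    # List the controller pods in the namespace used during installation.
    kubectl get pods -n argo-rollouts || return 1
    # Print the plugin version; this fails if the plugin binary is not on PATH.
    kubectl argo rollouts version || return 1
}
```

Calling check_argo_rollouts from a shell that has access to the cluster should list the controller pod and then print the plugin version; a failure in either step points at the corresponding half of the installation.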
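Canary release
A canary release consists of two processes, Replica Shifting and Traffic Shifting.
- Replica Shifting: replacing pods of the old version with pods of the new version
- Traffic Shifting: routing traffic to the new version
This walkthrough uses the official demo for testing. Example: https://argoproj.github.io/argo-rollouts/getting-started/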
Replica Shifting
Deploying the application
Deploy the example with the following commands:
    kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/basic/rollout.yaml
    kubectl apply -f https://raw.githubusercontent.com/argoproj/argo-rollouts/master/docs/getting-started/basic/service.yaml
Let's first look at the contents of the first file, rollout.yaml:
    apiVersion: argoproj.io/v1alpha1
    kind: Rollout
    metadata:
      name: rollouts-demo
    spec:
      replicas: 5
      strategy:
        canary:
          steps:
          - setWeight: 20
          - pause: {}
          - setWeight: 40
          - pause: {duration: 10}
          - setWeight: 60
          - pause: {duration: 10}
          - setWeight: 80
          - pause: {duration: 10}
      revisionHistoryLimit: 2
      selector:
        matchLabels:
          app: rollouts-demo
      template:
        metadata:
          labels:
            app: rollouts-demo
        spec:
          containers:
          - name: rollouts-demo
            image: argoproj/rollouts-demo:blue
            ports:
            - name: http
              containerPort: 8080
              protocol: TCP
            resources:
              requests:
                memory: 32Mi
                cpu: 5m
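As you can see, apart from apiVersion, kind, and strategy, the manifest is the same as a Deployment.
The strategy field defines the release strategy, where:
- setWeight: sets the percentage of traffic sent to the canary. With no traffic router configured, as in this demo, the weight is approximated by the share of canary pods.
- pause: pauses the rollout. With no duration (pause: {}), the rollout waits until it is promoted manually; with a duration such as duration: 10, it resumes automatically after waiting that long (10 seconds here).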
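Since this demo has no traffic router, each setWeight value effectively translates into a number of canary pods. The sketch below estimates that number by rounding up replicas * weight / 100. It is a simplification: the function name canary_replicas is illustrative, and the real controller also takes settings such as maxSurge and maxUnavailable into account.

```shell
#!/bin/sh
# Simplified estimate: canary pods = ceil(replicas * weight / 100).
# Illustrative only; the actual controller logic has more inputs.
# Usage: canary_replicas <replicas> <weight-percent>
canary_replicas() {
    # Integer ceiling division, no floating point needed.
    echo $(( ($1 * $2 + 99) / 100 ))
}

# Walk the setWeight values from rollout.yaml with replicas: 5.
for weight in 20 40 60 80; do
    echo "setWeight $weight -> $(canary_replicas 5 "$weight") canary pod(s)"
done
```

With five replicas this gives 1, 2, 3 and 4 canary pods for the four steps, which lines up with the Updated counts in the status output shown later in this walkthrough.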
The service.yaml file defines an ordinary Service:
    apiVersion: v1
    kind: Service
    metadata:
      name: rollouts-demo
    spec:
      ports:
      - port: 80
        targetPort: http
        protocol: TCP
        name: http
      selector:
        app: rollouts-demo
After running the commands above, five pods are created in the default namespace:
    # kubectl get pod
    NAME                                    READY   STATUS    RESTARTS   AGE
    nfs-client-prosioner-598d477ff6-fmgwf   1/1     Running   2          17d
    rollouts-demo-7bf84f9696-4glv6          1/1     Running   0          78s
    rollouts-demo-7bf84f9696-7kqt6          1/1     Running   0          78s
    rollouts-demo-7bf84f9696-8k9hw          1/1     Running   0          78s
    rollouts-demo-7bf84f9696-9cz2r          1/1     Running   0          78s
    rollouts-demo-7bf84f9696-jvzvd          1/1     Running   0          78s
You can check the deployment status with the kubectl-argo-rollouts get rollout rollouts-demo command:
    # kubectl-argo-rollouts get rollout rollouts-demo
    Name:            rollouts-demo
    Namespace:       default
    Status:          ✔ Healthy
    Strategy:        Canary
      Step:          8/8
      SetWeight:     100
      ActualWeight:  100
    Images:          argoproj/rollouts-demo:blue (stable)
    Replicas:
      Desired:       5
      Current:       5
      Updated:       5
      Ready:         5
      Available:     5

    NAME                                       KIND        STATUS     AGE   INFO
    ⟳ rollouts-demo                            Rollout     ✔ Healthy  114s
    └──# revision:1
       └──⧉ rollouts-demo-7bf84f9696           ReplicaSet  ✔ Healthy  114s  stable
          ├──□ rollouts-demo-7bf84f9696-4glv6  Pod         ✔ Running  114s  ready:1/1
          ├──□ rollouts-demo-7bf84f9696-7kqt6  Pod         ✔ Running  114s  ready:1/1
          ├──□ rollouts-demo-7bf84f9696-8k9hw  Pod         ✔ Running  114s  ready:1/1
          ├──□ rollouts-demo-7bf84f9696-9cz2r  Pod         ✔ Running  114s  ready:1/1
          └──□ rollouts-demo-7bf84f9696-jvzvd  Pod         ✔ Running  114s  ready:1/1
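This revision is marked stable and its STATUS is Healthy. You can also append --watch to the command to monitor the state in real time; the full command is kubectl argo rollouts get rollout rollouts-demo --watch.
Updating the application
Next, update the application. As with an application managed by a Deployment, updating means changing the image. The Argo Rollouts plugin provides a set image command for this: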
    kubectl argo rollouts set image rollouts-demo \
      rollouts-demo=argoproj/rollouts-demo:yellow
After the update, watch the state with kubectl argo rollouts get rollout rollouts-demo --watch:
    Name:            rollouts-demo
    Namespace:       default
    Status:          ॥ Paused
    Message:         CanaryPauseStep
    Strategy:        Canary
      Step:          1/8
      SetWeight:     20
      ActualWeight:  20
    Images:          argoproj/rollouts-demo:blue (stable)
                     argoproj/rollouts-demo:yellow (canary)
    Replicas:
      Desired:       5
      Current:       5
      Updated:       1
      Ready:         5
      Available:     5

    NAME                                       KIND        STATUS     AGE    INFO
    ⟳ rollouts-demo                            Rollout     ॥ Paused   9m12s
    ├──# revision:2
    │  └──⧉ rollouts-demo-789746c88d           ReplicaSet  ✔ Healthy  44s    canary
    │     └──□ rollouts-demo-789746c88d-l4gmd  Pod         ✔ Running  44s    ready:1/1
    └──# revision:1
       └──⧉ rollouts-demo-7bf84f9696           ReplicaSet  ✔ Healthy  9m12s  stable
          ├──□ rollouts-demo-7bf84f9696-4glv6  Pod         ✔ Running  9m12s  ready:1/1
          ├──□ rollouts-demo-7bf84f9696-8k9hw  Pod         ✔ Running  9m12s  ready:1/1
          ├──□ rollouts-demo-7bf84f9696-9cz2r  Pod         ✔ Running  9m12s  ready:1/1
          └──□ rollouts-demo-7bf84f9696-jvzvd  Pod         ✔ Running  9m12s  ready:1/1
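A new revision:2 has appeared. It is marked canary, the Status is Paused, and the canary carries a weight of 20% (one of the five pods).
The rollout is in the Paused state because rollout.yaml defines a pause with no duration after the first setWeight step, so continuing the update requires manual intervention.
Argo Rollouts provides the promote command to continue the update:
    kubectl argo rollouts promote rollouts-demo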
The watch view then shows the update progressing:
    Name:            rollouts-demo
    Namespace:       default
    Status:          ॥ Paused
    Message:         CanaryPauseStep
    Strategy:        Canary
      Step:          3/8
      SetWeight:     40
      ActualWeight:  40
    Images:          argoproj/rollouts-demo:blue (stable)
                     argoproj/rollouts-demo:yellow (canary)
    Replicas:
      Desired:       5
      Current:       5
      Updated:       2
      Ready:         5
      Available:     5

    NAME                                       KIND        STATUS         AGE    INFO
    ⟳ rollouts-demo                            Rollout     ॥ Paused       15m
    ├──# revision:2
    │  └──⧉ rollouts-demo-789746c88d           ReplicaSet  ✔ Healthy      6m46s  canary
    │     ├──□ rollouts-demo-789746c88d-l4gmd  Pod         ✔ Running      6m46s  ready:1/1
    │     └──□ rollouts-demo-789746c88d-67dwp  Pod         ✔ Running      19s    ready:1/1
    └──# revision:1
       └──⧉ rollouts-demo-7bf84f9696           ReplicaSet  ✔ Healthy      15m    stable
          ├──□ rollouts-demo-7bf84f9696-4glv6  Pod         ✔ Running      15m    ready:1/1
          ├──□ rollouts-demo-7bf84f9696-8k9hw  Pod         ✔ Running      15m    ready:1/1
          ├──□ rollouts-demo-7bf84f9696-9cz2r  Pod         ✔ Running      15m    ready:1/1
          └──□ rollouts-demo-7bf84f9696-jvzvd  Pod         ◌ Terminating  15m    ready:1/1
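Because the remaining pause steps last only 10 seconds each, the rest of the update proceeds step by step automatically, with no manual intervention. Once it has finished, the overall state looks like this: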
    Name:            rollouts-demo
    Namespace:       default
    Status:          ✔ Healthy
    Strategy:        Canary
      Step:          8/8
      SetWeight:     100
      ActualWeight:  100
    Images:          argoproj/rollouts-demo:yellow (stable)
    Replicas:
      Desired:       5
      Current:       5
      Updated:       5
      Ready:         5
      Available:     5

    NAME                                       KIND        STATUS        AGE    INFO
    ⟳ rollouts-demo                            Rollout     ✔ Healthy     17m
    ├──# revision:2
    │  └──⧉ rollouts-demo-789746c88d           ReplicaSet  ✔ Healthy     8m35s  stable
    │     ├──□ rollouts-demo-789746c88d-l4gmd  Pod         ✔ Running     8m35s  ready:1/1
    │     ├──□ rollouts-demo-789746c88d-67dwp  Pod         ✔ Running     2m8s   ready:1/1
    │     ├──□ rollouts-demo-789746c88d-k7mfk  Pod         ✔ Running     106s   ready:1/1
    │     ├──□ rollouts-demo-789746c88d-glbfb  Pod         ✔ Running     94s    ready:1/1
    │     └──□ rollouts-demo-789746c88d-d7m4f  Pod         ✔ Running     83s    ready:1/1
    └──# revision:1
       └──⧉ rollouts-demo-7bf84f9696           ReplicaSet  • ScaledDown  17m
The first revision has been scaled down, the second revision is Healthy, and its image is now marked stable.
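When the update is driven from a script or a CI job instead of being watched interactively, the plugin's status command can block until the rollout finishes. The following is a minimal sketch, assuming the rollout name and a timeout are passed in. The wrapper name wait_for_rollout is hypothetical, and the sketch relies on the command's exit code, which is expected to be non-zero when the rollout ends up degraded or the timeout expires.

```shell
#!/bin/sh
# Hypothetical wrapper: block until the rollout completes or the timeout hits.
# Usage: wait_for_rollout <rollout-name> <timeout, e.g. 120s>
wait_for_rollout() {
    if kubectl argo rollouts status "$1" --timeout "$2"; then
        echo "rollout $1 finished"
    else
        # Reached when the status command exits non-zero.
        echo "rollout $1 is not healthy (degraded or timed out after $2)" >&2
        return 1
    fi
}
```

With the rollout.yaml used here, the first step is an indefinite pause, so a script would still have to run kubectl argo rollouts promote rollouts-demo before a call such as wait_for_rollout rollouts-demo 120s can succeed.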
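Aborting an update
What should you do if the new version turns out to be faulty partway through an update and the update has to be stopped?
First, release a new version of the application with the following command: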
    kubectl argo rollouts set image rollouts-demo \
      rollouts-demo=argoproj/rollouts-demo:red
The rollout then stops in the Paused state at the first step. At this point the release can be stopped with abort:
    kubectl argo rollouts abort rollouts-demo
Once the command has run, the watch view shows the following:
    Name:            rollouts-demo
    Namespace:       default
    Status:          ✖ Degraded
    Message:         RolloutAborted: Rollout is aborted
    Strategy:        Canary
      Step:          0/8
      SetWeight:     0
      ActualWeight:  0
    Images:          argoproj/rollouts-demo:yellow (stable)
    Replicas:
      Desired:       5
      Current:       5
      Updated:       0
      Ready:         5
      Available:     5

    NAME                                       KIND        STATUS        AGE    INFO
    ⟳ rollouts-demo                            Rollout     ✖ Degraded    21m
    ├──# revision:3
    │  └──⧉ rollouts-demo-6f75f48b7            ReplicaSet  • ScaledDown  90s    canary
    ├──# revision:2
    │  └──⧉ rollouts-demo-789746c88d           ReplicaSet  ✔ Healthy     13m    stable
    │     ├──□ rollouts-demo-789746c88d-l4gmd  Pod         ✔ Running     13m    ready:1/1
    │     ├──□ rollouts-demo-789746c88d-67dwp  Pod         ✔ Running     6m46s  ready:1/1
    │     ├──□ rollouts-demo-789746c88d-k7mfk  Pod         ✔ Running     6m24s  ready:1/1
    │     ├──□ rollouts-demo-789746c88d-glbfb  Pod         ✔ Running     6m12s  ready:1/1
    │     └──□ rollouts-demo-789746c88d-nntc9  Pod         ✔ Running     18s    ready:1/1
    └──# revision:1
       └──⧉ rollouts-demo-7bf84f9696           ReplicaSet  • ScaledDown  21m
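The application ends up back on the stable version.
However, the Status is Degraded rather than Healthy, because the desired state (the red image) still differs from what is actually running, so the rollout needs to be brought back to Healthy. The simplest way is to set the image back to the stable version with the following command: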
    kubectl argo rollouts set image rollouts-demo \
      rollouts-demo=argoproj/rollouts-demo:yellow
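After running it, the status changes to Healthy immediately, and no new pods or ReplicaSet are created; the existing stable ReplicaSet is simply reused as revision 4: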
    Name:            rollouts-demo
    Namespace:       default
    Status:          ✔ Healthy
    Strategy:        Canary
      Step:          8/8
      SetWeight:     100
      ActualWeight:  100
    Images:          argoproj/rollouts-demo:yellow (stable)
    Replicas:
      Desired:       5
      Current:       5
      Updated:       5
      Ready:         5
      Available:     5

    NAME                                       KIND        STATUS        AGE  INFO
    ⟳ rollouts-demo                            Rollout     ✔ Healthy     40m
    ├──# revision:4
    │  └──⧉ rollouts-demo-789746c88d           ReplicaSet  ✔ Healthy     32m  stable
    │     ├──□ rollouts-demo-789746c88d-l4gmd  Pod         ✔ Running     32m  ready:1/1
    │     ├──□ rollouts-demo-789746c88d-67dwp  Pod         ✔ Running     26m  ready:1/1
    │     ├──□ rollouts-demo-789746c88d-k7mfk  Pod         ✔ Running     25m  ready:1/1
    │     ├──□ rollouts-demo-789746c88d-glbfb  Pod         ✔ Running     25m  ready:1/1
    │     └──□ rollouts-demo-789746c88d-nntc9  Pod         ✔ Running     19m  ready:1/1
    ├──# revision:3
    │  └──⧉ rollouts-demo-6f75f48b7            ReplicaSet  • ScaledDown  20m
    └──# revision:1
       └──⧉ rollouts-demo-7bf84f9696           ReplicaSet  • ScaledDown  40m
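To check from a script whether a rollout is back to Healthy after an abort, something like the following could be used instead of reading the watch output. It is a sketch under assumptions: the helper names rollout_phase and is_rollout_healthy are made up, and it assumes the installed Argo Rollouts version exposes the phase in the .status.phase field of the Rollout resource.

```shell
#!/bin/sh
# Hypothetical helper: print the rollout phase (e.g. Healthy, Paused, Degraded).
# Assumes the Rollout resource exposes .status.phase.
rollout_phase() {
    kubectl get rollout "$1" -o jsonpath='{.status.phase}'
}

# Hypothetical helper: succeed only when the rollout reports Healthy.
is_rollout_healthy() {
    [ "$(rollout_phase "$1")" = "Healthy" ]
}
```

For example, is_rollout_healthy rollouts-demo || echo "still degraded" would print the message right after the abort and stay silent once the image has been set back to the stable version.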