End-to-End Canary Releases with Alibaba Cloud ASM Swim Lanes and Kruise Rollout
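Overview
Canary release is an effective and important release strategy in microservice development. With sound traffic control and monitoring, it substantially reduces the risk of a release and enables fast iteration and efficient delivery. A canary release forwards a small share of production traffic to the new version of a service, selected either by request content or by a percentage of requests. Once the new version is validated, its share of traffic is increased step by step. Compared with other release modes such as all-at-once and blue-green releases, canary releases use resources more efficiently and offer more flexibility.
When services call each other, a canary release is usually not confined to a single service. The whole request path needs environment isolation and traffic control: canary traffic must only reach the canary versions of the services along the call chain, so that each call chain runs in its own isolated environment. This is an end-to-end canary release.
ASM provides swim lanes, which isolate traffic for different versions. It also lets you define a baseline version: when a call to another version of a service fails, traffic automatically falls back to the baseline version, which makes canary traffic testing more flexible and stable. Combined with the progressive delivery framework Kruise Rollout, which automates the release state of new application versions, this enables an end-to-end canary release across the whole release lifecycle.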
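About ASM
Alibaba Cloud Service Mesh (ASM) is a managed, Istio-compatible platform for unified management of microservice application traffic. Through traffic control, mesh observability, and secure service-to-service communication, ASM simplifies service governance across the board and provides unified management for services running on heterogeneous computing infrastructure. It supports Kubernetes clusters, Serverless Kubernetes clusters, ECS virtual machines, and self-managed clusters. ASM delivers a managed, unified service mesh for core scenarios such as hybrid cloud, multi-cloud, and multi-cluster deployments.
ASM can isolate a given version of an application (or another characteristic) into an independent runtime environment, called a swim lane. Lane rules then route matching request traffic to the application with the target version (or other characteristic). Together with custom routing resources such as virtual services and destination rules, this enables unified traffic ingress across the whole call chain, fine-grained route management, and plug-in style traffic processing.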
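About Kruise Rollout
Kruise Rollout is an open-source progressive delivery framework from the OpenKruise community. It supports canary releases, blue-green releases, and A/B testing releases, coordinating traffic shifting with instance rollout. Based on Prometheus metrics, it can also batch and pause a release automatically. It integrates as a non-intrusive bypass component and is compatible with several existing workload types (Deployment, CloneSet, StatefulSet). For more information, see [1] Kruise Rollout.
Kruise Rollout works as a bypass mechanism. You only need to configure a Rollout resource and apply it to the Kubernetes cluster. Subsequent releases and upgrades of the workload need no extra steps, and the mechanism integrates with Helm and PaaS platforms at low cost. The figure below shows the architecture of a canary release with Kruise Rollout.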
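Best practice
- Create the ingressgateway gateway rule -
First, define the gateway rule for the traffic entry point. Create a gateway rule named ingressgateway in the istio-system namespace with the following content. For details, see [2] Manage Istio gateways.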
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: ingressgateway
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - '*'
- Deploy the baseline applications -
Deploy three applications in the cluster: mocka, mockb, and mockc. mocka calls mockb, and mockb calls mockc.
apiVersion: v1
kind: Service
metadata:
  name: mocka
  labels:
    app: mocka
    service: mocka
spec:
  ports:
  - port: 8000
    name: http
  selector:
    app: mocka
---
apiVersion: v1
kind: Service
metadata:
  name: mockb
  labels:
    app: mockb
    service: mockb
spec:
  ports:
  - port: 8000
    name: http
  selector:
    app: mockb
---
apiVersion: v1
kind: Service
metadata:
  name: mockc
  labels:
    app: mockc
    service: mockc
spec:
  ports:
  - port: 8000
    name: http
  selector:
    app: mockc
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mocka-v1
  labels:
    app: mocka
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mocka
  template:
    metadata:
      labels:
        app: mocka
        version: v1
        ASM_TRAFFIC_TAG: base
    spec:
      containers:
      - name: default
        image: registry.cn-beijing.aliyuncs.com/aliacs-app-catalog/go-http-sample:1.0
        imagePullPolicy: IfNotPresent
        env:
        - name: version
          value: v1
        - name: app
          value: mocka
        - name: upstream_url
          value: "http://mockb:8000/"
        ports:
        - containerPort: 8000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mockb-v1
  labels:
    app: mockb
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mockb
  template:
    metadata:
      labels:
        app: mockb
        version: v1
        ASM_TRAFFIC_TAG: base
    spec:
      containers:
      - name: default
        image: registry.cn-beijing.aliyuncs.com/aliacs-app-catalog/go-http-sample:1.0
        imagePullPolicy: IfNotPresent
        env:
        - name: version
          value: v1
        - name: app
          value: mockb
        - name: upstream_url
          value: "http://mockc:8000/"
        ports:
        - containerPort: 8000
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mockc-v1
  labels:
    app: mockc
    version: v1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mockc
  template:
    metadata:
      labels:
        app: mockc
        version: v1
        ASM_TRAFFIC_TAG: base
    spec:
      containers:
      - name: default
        image: registry.cn-beijing.aliyuncs.com/aliacs-app-catalog/go-http-sample:1.0
        imagePullPolicy: IfNotPresent
        env:
        - name: version
          value: v1
        - name: app
          value: mockc
        ports:
        - containerPort: 8000
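- Deploy the ASM swim lane group and swim lanes -
Deploy one lane group and two lanes. The lane group declares the three services it contains: mocka, mockb, and mockc. The two lanes are the baseline lane and the canary lane: the baseline lane contains applications labeled ASM_TRAFFIC_TAG: base, and the canary lane contains applications labeled ASM_TRAFFIC_TAG: canary.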
apiVersion: istio.alibabacloud.com/v1
kind: ASMSwimLaneGroup
metadata:
  name: rollout
spec:
  ingress:
    gateway:
      name: ingressgateway
      namespace: istio-system
      type: ASM
    ingressRouting:
      ingressRoutingStrategy: rule_based
      weightedRoutingRule:
        hosts:
        - '*'
        requestMatches:
        - uri:
            exact: /mock
  isPermissive: true
  permissiveModeConfiguration:
    fallbackTarget: base
    routeHeader: x-asm-prefer-tag
    traceHeader: my-request-id
  autoUpdate: true
  services:
  - cluster:
      id: ce9724f7548914f9bbc0c09bbf0481623
      name: test
    name: mockb
    namespace: default
  - cluster:
      id: ce9724f7548914f9bbc0c09bbf0481623
      name: test
    name: mocka
    namespace: default
  - cluster:
      id: ce9724f7548914f9bbc0c09bbf0481623
      name: test
    name: mockc
    namespace: default
---
apiVersion: istio.alibabacloud.com/v1
kind: ASMSwimLane
metadata:
  labels:
    swimlane-group: rollout
  name: base
spec:
  ingressRules:
  - hosts:
    - '*'
    match:
      headers:
        x-asm-prefer-tag:
          exact: base
      uri:
        exact: /mock
    name: base
    online: true
    route:
      destination:
        host: mocka.default.svc.cluster.local
  ingressWeight:
    destination: {}
  labelSelector:
    ASM_TRAFFIC_TAG: base
  services:
  - cluster:
      id: ce9724f7548914f9bbc0c09bbf0481623
      name: test
    name: mockb
    namespace: default
  - cluster:
      id: ce9724f7548914f9bbc0c09bbf0481623
      name: test
    name: mocka
    namespace: default
  - cluster:
      id: ce9724f7548914f9bbc0c09bbf0481623
      name: test
    name: mockc
    namespace: default
---
apiVersion: istio.alibabacloud.com/v1
kind: ASMSwimLane
metadata:
  labels:
    swimlane-group: rollout
  name: canary
spec:
  ingressRules:
  - hosts:
    - '*'
    match:
      headers:
        x-asm-prefer-tag:
          exact: canary
      uri:
        exact: /mock
    name: canary
    online: true
    route:
      destination:
        host: mocka.default.svc.cluster.local
  labelSelector:
    ASM_TRAFFIC_TAG: canary
  services: []
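Once these are deployed, ASM automatically creates VirtualService and DestinationRule resources from the services and routing rules defined in the lane group and lanes. The routing rules mean:
• Traffic carrying the header x-asm-prefer-tag: base all goes to the baseline lane
• Traffic carrying the header x-asm-prefer-tag: canary all goes to the canary lane
• When a call to a service in the canary lane fails, the traffic is redirected to the baseline lane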
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: swimlane-ingress-vs-rollout-base
  namespace: istio-system
spec:
  gateways:
  - istio-system/ingressgateway
  hosts:
  - '*'
  http:
  - match:
    - headers:
        x-asm-prefer-tag:
          exact: base
      uri:
        exact: /mock
    name: base
    route:
    - destination:
        host: mocka.default.svc.cluster.local
        subset: base
      fallback:
        target:
          host: mocka.default.svc.cluster.local
          subset: base
      headers:
        request:
          set:
            x-asm-prefer-tag: base
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: swimlane-ingress-vs-rollout-canary
  namespace: istio-system
spec:
  gateways:
  - istio-system/ingressgateway
  hosts:
  - '*'
  http:
  - match:
    - headers:
        x-asm-prefer-tag:
          exact: canary
      uri:
        exact: /mock
    name: canary
    route:
    - destination:
        host: mocka.default.svc.cluster.local
        subset: canary
      fallback:
        target:
          host: mocka.default.svc.cluster.local
          subset: base
      headers:
        request:
          set:
            x-asm-prefer-tag: canary
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-vs-rollout-default-mocka
  namespace: istio-system
spec:
  hosts:
  - mocka.default.svc.cluster.local
  http:
  - name: default
    route:
    - destination:
        host: mocka.default.svc.cluster.local
        subset: $asm-label
      fallback:
        target:
          host: mocka.default.svc.cluster.local
          subset: base
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-vs-rollout-default-mockb
  namespace: istio-system
spec:
  hosts:
  - mockb.default.svc.cluster.local
  http:
  - name: default
    route:
    - destination:
        host: mockb.default.svc.cluster.local
        subset: $asm-label
      fallback:
        target:
          host: mockb.default.svc.cluster.local
          subset: base
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-vs-rollout-default-mockc
  namespace: istio-system
spec:
  hosts:
  - mockc.default.svc.cluster.local
  http:
  - name: default
    route:
    - destination:
        host: mockc.default.svc.cluster.local
        subset: $asm-label
      fallback:
        target:
          host: mockc.default.svc.cluster.local
          subset: base
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mocka
  namespace: istio-system
spec:
  host: mocka.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mockb
  namespace: istio-system
spec:
  host: mockb.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mockc
  namespace: istio-system
spec:
  host: mockc.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
In the initial state, the baseline lane contains the three released baseline applications, and the canary lane contains no services. Accessing the baseline lane with the following command returns:
$ curl ${ASM_GATEWAY}/mock -H 'x-asm-prefer-tag: base' -H 'my-request-id: 9999'
-> mocka(version: v1, ip: 172.16.0.87)-> mockb(version: v1, ip: 172.16.0.97)-> mockc(version: v1, ip: 172.16.0.89)
Because the canary lane contains no applications yet, requests addressed to the canary lane also end up in the baseline lane.
$ curl ${ASM_GATEWAY}/mock -H 'x-asm-prefer-tag: canary' -H 'my-request-id: 10000'
-> mocka(version: v1, ip: 172.16.0.87)-> mockb(version: v1, ip: 172.16.0.97)-> mockc(version: v1, ip: 172.16.0.89)
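The per-hop fallback behaviour described above can be modelled in a few lines of Python. This is only a simplified sketch of the observable routing outcome, not how ASM implements it: the function name resolve_chain and the "service@lane" notation are made up for illustration, and real routing is performed by the sidecar proxies using the generated VirtualService and DestinationRule resources.

```python
from typing import Dict, List, Set

BASELINE = "base"  # mirrors fallbackTarget: base in the lane group


def resolve_chain(chain: List[str],
                  deployed: Dict[str, Set[str]],
                  preferred: str) -> List[str]:
    """Pick a lane for every hop, falling back to the baseline per service."""
    hops = []
    for service in chain:
        lanes = deployed.get(service, set())
        # Use the preferred lane if this service has a version there,
        # otherwise fall back to the baseline lane for this hop only.
        lane = preferred if preferred in lanes else BASELINE
        if lane not in lanes:
            raise LookupError(f"{service} has no {BASELINE} version")
        hops.append(f"{service}@{lane}")
    return hops


chain = ["mocka", "mockb", "mockc"]
only_base = {name: {"base"} for name in chain}

# No canary versions exist yet, so a canary request stays in the baseline lane.
print(resolve_chain(chain, only_base, "canary"))
```

The fallback is decided independently at each hop, which is why a request tagged canary can still traverse a mix of canary and baseline versions once only some services have a canary version.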
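- Deploy Kruise Rollout -
Deploy a Kruise Rollout for each of mocka, mockb, and mockc in the cluster, with the following release strategy:
• A canary Deployment is created during the release to run the new version
• The release completes in one batch, which deploys one instance of the new version
• Canary pods carry the label ASM_TRAFFIC_TAG: canary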
apiVersion: rollouts.kruise.io/v1beta1
kind: Rollout
metadata:
  name: rollouts-mocka
spec:
  workloadRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mocka-v1
  strategy:
    canary:
      # Create a separate Deployment for the canary version
      enableExtraWorkloadForCanary: true
      # One batch in total; the first batch deploys one instance
      steps:
      - replicas: 1
      # Set the pods' ASM_TRAFFIC_TAG label to canary
      patchPodTemplateMetadata:
        labels:
          ASM_TRAFFIC_TAG: canary
---
apiVersion: rollouts.kruise.io/v1beta1
kind: Rollout
metadata:
  name: rollouts-mockb
spec:
  workloadRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mockb-v1
  strategy:
    canary:
      enableExtraWorkloadForCanary: true
      steps:
      - replicas: 1
      patchPodTemplateMetadata:
        labels:
          ASM_TRAFFIC_TAG: canary
---
apiVersion: rollouts.kruise.io/v1beta1
kind: Rollout
metadata:
  name: rollouts-mockc
spec:
  workloadRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mockc-v1
  strategy:
    canary:
      enableExtraWorkloadForCanary: true
      steps:
      - replicas: 1
      patchPodTemplateMetadata:
        labels:
          ASM_TRAFFIC_TAG: canary
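Since the three Rollout resources differ only in the application name, they could also be generated rather than written by hand. The following Python sketch builds them from one template and prints a JSON List that kubectl apply -f can read. The helper names build_rollout and build_rollout_list are hypothetical, and the fields are simply copied from the manifests above.

```python
import json
from typing import Any, Dict, List


def build_rollout(app: str) -> Dict[str, Any]:
    """Return a Rollout manifest for the Deployment named '<app>-v1'."""
    return {
        "apiVersion": "rollouts.kruise.io/v1beta1",
        "kind": "Rollout",
        "metadata": {"name": f"rollouts-{app}"},
        "spec": {
            "workloadRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": f"{app}-v1",
            },
            "strategy": {
                "canary": {
                    # Run the canary version in its own Deployment.
                    "enableExtraWorkloadForCanary": True,
                    # A single batch that deploys one instance.
                    "steps": [{"replicas": 1}],
                    # Label canary pods so they join the canary lane.
                    "patchPodTemplateMetadata": {
                        "labels": {"ASM_TRAFFIC_TAG": "canary"}
                    },
                }
            },
        },
    }


def build_rollout_list(apps: List[str]) -> Dict[str, Any]:
    """Wrap one Rollout per application in a v1 List."""
    return {
        "apiVersion": "v1",
        "kind": "List",
        "items": [build_rollout(app) for app in apps],
    }


print(json.dumps(build_rollout_list(["mocka", "mockb", "mockc"]), indent=2))
```

Saved as a script (the file name is up to you), its output can be piped straight into kubectl apply -f -.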
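- Release the new application version -
Release mocka
Run the following command to change the image of the baseline application and start the release of the new version: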
$ kubectl patch deployment mocka-v1 \
  -p '{"spec": {"template": {"spec": {"containers": [{"name": "default", "image": "registry.cn-beijing.aliyuncs.com/aliacs-app-catalog/go-http-sample:2.0"}]}}}}'
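Nested JSON typed by hand on a command line is easy to unbalance. As a hedged alternative, the patch body can be built programmatically so that the braces are always well-formed. The helper names image_patch and patch_command below are hypothetical. The script only prints the command and does not run it.

```python
import json
import shlex

IMAGE_V2 = "registry.cn-beijing.aliyuncs.com/aliacs-app-catalog/go-http-sample:2.0"


def image_patch(container: str, image: str) -> str:
    """Return a JSON patch body that sets the image of one container."""
    patch = {
        "spec": {
            "template": {
                "spec": {
                    "containers": [{"name": container, "image": image}]
                }
            }
        }
    }
    return json.dumps(patch)


def patch_command(deployment: str, container: str, image: str) -> str:
    """Return a shell-quoted kubectl patch command line."""
    return "kubectl patch deployment {} -p {}".format(
        shlex.quote(deployment), shlex.quote(image_patch(container, image))
    )


print(patch_command("mocka-v1", "default", IMAGE_V2))
```

The same two helpers cover the later mockb-v1 and mockc-v1 patches by changing only the Deployment name.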
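Kruise Rollout automatically creates the canary Deployment and deploys the new-version pod in the canary lane. Run the following command to view the DestinationRule:
$ kubectl get destinationrule trafficlabel-dr-rollout-default-mocka -n istio-system -o yaml
The mocka DestinationRule now contains a subset for the new version (the DestinationRules for mockb and mockc are shown for comparison and are unchanged):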
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mocka
  namespace: istio-system
spec:
  host: mocka.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
  # Subset added for the new version of mocka
  - labels:
      ASM_TRAFFIC_TAG: canary
    name: canary
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mockb
  namespace: istio-system
spec:
  host: mockb.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mockc
  namespace: istio-system
spec:
  host: mockc.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
Accessing the canary lane with the following command now returns:
$ curl ${ASM_GATEWAY}/mock -H 'x-asm-prefer-tag: canary' -H 'my-request-id: 10001'
-> mocka(version: v2, ip: 172.16.0.88)-> mockb(version: v1, ip: 172.16.0.97)-> mockc(version: v1, ip: 172.16.0.89)
Because only mocka has a canary version at this point, the calls to mockb and mockc fall back to the baseline lane.
Release mockb and mockc
Run the following commands to change the images of the mockb and mockc baseline applications and start the release of their new versions:
$ kubectl patch deployment mockb-v1 \
  -p '{"spec": {"template": {"spec": {"containers": [{"name": "default", "image": "registry.cn-beijing.aliyuncs.com/aliacs-app-catalog/go-http-sample:2.0"}]}}}}'
$ kubectl patch deployment mockc-v1 \
  -p '{"spec": {"template": {"spec": {"containers": [{"name": "default", "image": "registry.cn-beijing.aliyuncs.com/aliacs-app-catalog/go-http-sample:2.0"}]}}}}'
Checking the DestinationRules again at this point shows:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mocka
  namespace: istio-system
spec:
  host: mocka.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
  - labels:
      ASM_TRAFFIC_TAG: canary
    name: canary
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mockb
  namespace: istio-system
spec:
  host: mockb.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
  - labels:
      ASM_TRAFFIC_TAG: canary
    name: canary
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mockc
  namespace: istio-system
spec:
  host: mockc.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
  - labels:
      ASM_TRAFFIC_TAG: canary
    name: canary
mockb and mockc now have canary subsets as well. Accessing the canary lane with the following command shows that all traffic goes to the v2 applications:
$ curl ${ASM_GATEWAY}/mock -H 'x-asm-prefer-tag: canary' -H 'my-request-id: 10002'
-> mocka(version: v2, ip: 172.16.0.88)-> mockb(version: v2, ip: 172.16.0.78)-> mockc(version: v2, ip: 172.16.0.80)
Accessing the baseline lane with the following command shows that all traffic goes to the v1 applications:
$ curl ${ASM_GATEWAY}/mock -H 'x-asm-prefer-tag: base' -H 'my-request-id: 10003'
-> mocka(version: v1, ip: 172.16.0.87)-> mockb(version: v1, ip: 172.16.0.97)-> mockc(version: v1, ip: 172.16.0.89)
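Checks like these can be automated in a smoke test. The Python sketch below parses a response of the form shown in this article and reports which version served each hop. It assumes the sample application's response format stays exactly as printed here. The names parse_hops and all_on_version are hypothetical helpers.

```python
import re
from typing import List, Tuple

# One hop looks like: -> mocka(version: v2, ip: 172.16.0.88)
HOP = re.compile(r"->\s*(\w+)\(version:\s*(\w+),\s*ip:\s*([\d.]+)\)")


def parse_hops(output: str) -> List[Tuple[str, str]]:
    """Return (service, version) for every hop in the response body."""
    return [(m.group(1), m.group(2)) for m in HOP.finditer(output)]


def all_on_version(output: str, version: str) -> bool:
    """True if the response has at least one hop and all hops report `version`."""
    hops = parse_hops(output)
    return bool(hops) and all(v == version for _, v in hops)


canary_response = (
    "-> mocka(version: v2, ip: 172.16.0.88)"
    "-> mockb(version: v2, ip: 172.16.0.78)"
    "-> mockc(version: v2, ip: 172.16.0.80)"
)
print(parse_hops(canary_response))
print(all_on_version(canary_response, "v2"))
```

In a pipeline, the response body would come from the curl call against ${ASM_GATEWAY}/mock with the matching x-asm-prefer-tag header.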
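Finish the mockb release
With testing of the new mockb version complete, run the following command to continue the mockb release and promote the new version to the baseline.
$ kubectl kruise rollout approve rollout/rollouts-mockb
Checking the DestinationRules now shows that the canary subset has been removed from the mockb service.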
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mocka
  namespace: istio-system
spec:
  host: mocka.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
  - labels:
      ASM_TRAFFIC_TAG: canary
    name: canary
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mockb
  namespace: istio-system
spec:
  host: mockb.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
  # The canary subset has been removed
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mockc
  namespace: istio-system
spec:
  host: mockc.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
  - labels:
      ASM_TRAFFIC_TAG: canary
    name: canary
Accessing the baseline lane with the following command shows that the new version of mockb is now deployed in the baseline environment.
$ curl ${ASM_GATEWAY}/mock -H 'x-asm-prefer-tag: base' -H 'my-request-id: 10004'
-> mocka(version: v1, ip: 172.16.0.87)-> mockb(version: v2, ip: 172.16.0.236)-> mockc(version: v1, ip: 172.16.0.89)
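Finish the mocka and mockc releases
With testing of the new mocka and mockc versions complete, run the following commands to continue both releases and promote the new versions to the baseline.
$ kubectl kruise rollout approve rollout/rollouts-mocka
$ kubectl kruise rollout approve rollout/rollouts-mockc
Checking the DestinationRules now shows that only the baseline subsets remain.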
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mocka
  namespace: istio-system
spec:
  host: mocka.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mockb
  namespace: istio-system
spec:
  host: mockb.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  labels:
    asm-system: "true"
    provider: asm
    swimlane-group: rollout
  name: trafficlabel-dr-rollout-default-mockc
  namespace: istio-system
spec:
  host: mockc.default.svc.cluster.local
  subsets:
  - labels:
      ASM_TRAFFIC_TAG: base
    name: base
Accessing the baseline lane with the following command shows that the baseline environment has been fully replaced by the new version.
$ curl ${ASM_GATEWAY}/mock -H 'x-asm-prefer-tag: base' -H 'my-request-id: 10005'
-> mocka(version: v2, ip: 172.16.0.97)-> mockb(version: v2, ip: 172.16.0.236)-> mockc(version: v2, ip: 172.16.0.78)
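(Optional) Roll back the canary release
If you find an application misbehaving during a canary release, for example mocka does not run as expected, you can run the following command to roll back and stop the release of the new version.
$ kubectl kruise rollout undo rollout/rollouts-mocka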
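The lifecycle walked through in this section (start a release, then either approve or undo it) can be summarised as a tiny state model. The Python sketch below is a deliberately simplified illustration of the visible outcome per service, not a description of Kruise Rollout internals. The class ServiceRelease and its method names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServiceRelease:
    """Tracks which versions of one service exist in which lane."""
    base_version: str
    canary_version: Optional[str] = None

    def start(self, new_version: str) -> None:
        # Patching the image starts a release: a canary version appears.
        self.canary_version = new_version

    def approve(self) -> None:
        # Approving promotes the canary version into the baseline.
        if self.canary_version is None:
            raise RuntimeError("no release in progress")
        self.base_version = self.canary_version
        self.canary_version = None

    def undo(self) -> None:
        # Undoing drops the canary version and keeps the old baseline.
        self.canary_version = None

    def subsets(self) -> List[str]:
        # Lanes that currently have a version of this service.
        return ["base"] + (["canary"] if self.canary_version else [])


mocka = ServiceRelease(base_version="v1")
mocka.start("v2")
print(mocka.subsets())   # both lanes have a version during the release
mocka.undo()
print(mocka.base_version, mocka.subsets())
```

In this model, approve and undo both end with a single baseline subset. They differ only in which version the baseline runs afterwards.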
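This completes a full release cycle. Combining ASM with Kruise Rollout helps you implement progressive delivery: different application versions are isolated and routed within a single environment, which makes end-to-end canary testing more stable and less costly.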
Related links:
[1] Kruise Rollout
https://github.com/openkruise/rollouts
[2] Manage Istio gateways
https://help.aliyun.com/zh/asm/user-guide/manage-istio-gateways