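To better meet enterprises' growing needs for large-scale service mesh adoption, multi-language service interoperability, and fine-grained service governance, Alibaba Cloud released the commercial edition of its service mesh product ASM on April 1, 2022. The commercial edition provides performance, security, high-availability, and reliability guarantees for enterprises rolling out service mesh at scale in production.
Alibaba Cloud began investigating and practicing ServiceMesh technology internally early on. By distilling lessons from real business scenarios and continuously driving the technology forward, it built up a set of core service mesh technologies and consolidated them into ASM (Alibaba Cloud Service Mesh), the industry's first managed service mesh platform compatible with Istio, made available to cloud users. In this series of articles we summarize the typical problems users run into when using service mesh technology, together with their solutions.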
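Background
WebSocket is a protocol for bidirectional communication between a client and a server. It is standardized in RFC 6455 and is widely used by applications and API implementations. WebSocket is distinct from HTTP, but it uses the HTTP Upgrade header to establish the connection between the two parties. Istio does not recognize WebSocket as a separate protocol, but the Istio sidecar proxy supports WebSocket out of the box, with no extra configuration. In the example below, we run a WebSocket application and inspect the WebSocket-related parts of the Envoy configuration.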
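To make the "HTTP Upgrade" step concrete, here is a minimal sketch of the opening handshake defined by RFC 6455, using only the Python standard library. The function names are illustrative and not part of any library; the GUID and the sample key are taken from the RFC itself.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def build_upgrade_request(host: str, path: str, key: str) -> str:
    """Return the HTTP/1.1 request a client sends to open a WebSocket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines)


def accept_value(key: str) -> str:
    """Return the Sec-WebSocket-Accept value the server sends in its 101 response."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")
```

The proxy only needs to understand this request and the matching `101 Switching Protocols` response; everything after that is an opaque byte stream.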
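Envoy's support for WebSocket is described in the Envoy documentation on HTTP upgrades.
Istio's protocol selection and detection is described in the Istio documentation on protocol selection.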
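We will use the Python library provided by the WebSockets community to deploy a WebSocket application in our Kubernetes cluster. The library's documentation includes sample code and a YAML manifest that can be used to create the WebSocket application in a Kubernetes cluster. If you need to change the deployment configuration of this example, follow the steps in the "containerize application" part of the linked article (https://websockets.readthedocs.io/en/latest/howto/kubernetes.html) to create the Python application and rebuild a new Docker image.
Then modify the deployment configuration from that example so that the WebSocket server runs inside the Istio mesh.
The following YAML manifest can be used for the deployment: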
apiVersion: v1
kind: Service
metadata:
  name: websockets-server
  labels:
    app: websockets-server
spec:
  type: ClusterIP
  ports:
    - port: 8080
      targetPort: 80
      name: http-websocket
  selector:
    app: websockets-server
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: websockets-server
  labels:
    app: websockets-server
spec:
  selector:
    matchLabels:
      app: websockets-server
  template:
    metadata:
      labels:
        app: websockets-server
    spec:
      containers:
        - name: websockets-test
          image: registry.cn-beijing.aliyuncs.com/aliacs-app-catalog/istio-websockets-test:1.0
          ports:
            - containerPort: 80
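The Service port above is named http-websocket. Under Istio's explicit protocol selection, the prefix of the port name (the part before the first hyphen) declares the protocol, which is why the sidecar treats this port as HTTP and can handle the upgrade. The following is a simplified sketch of that naming rule, not Istio's actual implementation; the protocol set is an assumption based on the Istio documentation and may differ between versions.

```python
from typing import Optional

# Protocol prefixes assumed from the Istio protocol selection documentation.
KNOWN_PROTOCOLS = {
    "http", "http2", "https", "tcp", "tls",
    "grpc", "grpc-web", "mongo", "mysql", "redis", "udp",
}


def protocol_from_port_name(port_name: str) -> Optional[str]:
    """Return the protocol declared by a Service port name, or None if undeclared."""
    name = port_name.lower()
    # "grpc-web" itself contains a hyphen, so check it before splitting.
    if name == "grpc-web" or name.startswith("grpc-web-"):
        return "grpc-web"
    prefix = name.split("-", 1)[0]
    return prefix if prefix in KNOWN_PROTOCOLS else None
```

With this rule, `protocol_from_port_name("http-websocket")` yields `"http"`, while a port named just `websocket` declares no protocol and would be left to automatic detection.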
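Next, deploy the application into a namespace in the Istio mesh that has automatic sidecar proxy injection enabled.
Now we deploy a client WebSocket application, which can be used to open an interactive WebSocket echo session. We will use the same WebSockets Python library to test our application, but we need an image with the library preinstalled.
The following simple Dockerfile creates the image we will use: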
FROM python:3.9-alpine
RUN pip3 install websockets
Alternatively, use the Docker image we have already built. The following YAML manifest can be deployed to the cluster:
apiVersion: v1
kind: Service
metadata:
  name: websockets-client
  labels:
    app: websockets-client
spec:
  type: ClusterIP
  ports:
    - port: 8080
      targetPort: 80
      name: http-websockets-client
  selector:
    app: websockets-client
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: websockets-client-sleep
  labels:
    app: websockets-client
spec:
  selector:
    matchLabels:
      app: websockets-client
  template:
    metadata:
      labels:
        app: websockets-client
    spec:
      containers:
        - name: websockets-client
          image: registry.cn-beijing.aliyuncs.com/aliacs-app-catalog/istio-websockets-client-test:1.0
          command: ["sleep", "14d"]
Deploy the YAML above into the same namespace as the WebSocket server application.
To test the application, open a shell in the client pod:
kubectl exec -it -n <namespace> websockets-client-sleep-<xxxx> -c websockets-client -- sh
To open an interactive WebSocket echo connection to the server, use the command below (replace <namespace> with the actual namespace, for example default):
python3 -m websockets ws://websockets-server.<namespace>.svc.cluster.local:8080
The output looks like this:
Connected to ws://websockets-server.default.svc.cluster.local:8080.
> hello
< hello
> world
< world
Connection closed: 1000 (OK).
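The same echo exchange can also be scripted instead of typed interactively. The sketch below assumes the websockets library from the client image is installed and that the server echoes each message, as in the session above; `server_uri`, `echo_once`, and `main` are hypothetical helper names, not part of the library.

```python
import asyncio


def server_uri(namespace: str, service: str = "websockets-server", port: int = 8080) -> str:
    """Build the in-cluster ws:// URI used in the command above."""
    return f"ws://{service}.{namespace}.svc.cluster.local:{port}"


async def echo_once(uri: str, message: str) -> str:
    """Send one message and return the server's reply."""
    # Imported here so server_uri() stays usable without the library installed.
    import websockets

    async with websockets.connect(uri) as ws:
        await ws.send(message)
        return await ws.recv()


def main() -> None:
    # Intended to be called from inside the client pod.
    print(asyncio.run(echo_once(server_uri("default"), "hello")))
```

Calling `main()` from the client pod should open one connection, send a single message, and close the connection when the `async with` block exits.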
Running over HTTP/1.1
First, we run the example above on Istio 1.11. The following shows how the WebSocket connection appears in the istio-proxy access logs.
The client-side sidecar proxy log is:
{"method":"GET","upstream_local_address":"10.208.0.27:34200","upstream_transport_failure_reason":null,"x_forwarded_for":null,"duration":7035,"path":"/","istio_policy_status":null,"downstream_remote_address":"10.208.0.27:57510","bytes_received":35,"requested_server_name":null,"downstream_local_address":"192.168.106.4:8080","user_agent":"Python/3.9 websockets/10.2","upstream_cluster":"outbound|8080||websockets-server.default.svc.cluster.local","response_code":101,"upstream_host":"10.208.0.105:80","bytes_sent":23,"protocol":"HTTP/1.1","request_id":"dbcb8ac4-0fb0-4acf-94ef-24b31d704489","trace_id":null,"start_time":"2022-03-17T06:30:28.303Z","response_flags":"UC","route_name":"default","authority":"websockets-server.default.svc.cluster.local:8080","upstream_service_time":null}
The server-side sidecar proxy log is:
{"duration":6030,"bytes_received":35,"start_time":"2022-03-17T06:30:28.308Z","istio_policy_status":null,"route_name":"default","user_agent":"Python/3.9 websockets/10.2","upstream_service_time":null,"request_id":"dbcb8ac4-0fb0-4acf-94ef-24b31d704489","downstream_local_address":"10.208.0.105:80","upstream_local_address":"127.0.0.6:53983","protocol":"HTTP/1.1","requested_server_name":"outbound_.8080_._.websockets-server.default.svc.cluster.local","response_flags":"UC","upstream_cluster":"inbound|80||","upstream_host":"10.208.0.105:80","x_forwarded_for":null,"trace_id":null,"upstream_transport_failure_reason":null,"authority":"websockets-server.default.svc.cluster.local:8080","path":"/","method":"GET","downstream_remote_address":"10.208.0.27:34200","bytes_sent":23,"response_code":101}
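As shown above, even though we sent several messages to the server, there is only one access log entry for the WebSocket connection. Envoy writes that entry only after the connection is closed, and no entries appear for the individual messages sent over it. This is because, once the upgrade completes, Envoy treats the WebSocket connection as a TCP byte stream; the proxy only understands the HTTP upgrade request and response.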
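As a small illustration of reading such an entry, the sketch below parses one JSON access log line and pulls out the fields that matter here. The sample line is a trimmed copy of the client-side entry above (other fields omitted); `summarize` is a hypothetical helper name.

```python
import json

# Trimmed copy of the client-side access log entry shown above.
ENTRY = (
    '{"method":"GET","path":"/","protocol":"HTTP/1.1","response_code":101,'
    '"duration":7035,"bytes_received":35,"bytes_sent":23,"response_flags":"UC",'
    '"upstream_cluster":"outbound|8080||websockets-server.default.svc.cluster.local"}'
)


def summarize(line: str) -> dict:
    """Extract the upgrade-related fields from one JSON access log line."""
    entry = json.loads(line)
    return {
        # 101 Switching Protocols marks a successful HTTP upgrade.
        "is_upgrade": entry["response_code"] == 101,
        "protocol": entry["protocol"],
        # Covers the whole connection lifetime, in milliseconds.
        "duration_ms": entry["duration"],
        "bytes_received": entry["bytes_received"],
        "bytes_sent": entry["bytes_sent"],
    }
```

The single entry covers the whole connection: the duration spans the entire session, and the byte counters aggregate all messages exchanged after the upgrade.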
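The HTTP/2 upgrade problem
Istio versions below 1.12 have a problem with the HTTP/2 upgrade flow for WebSocket: the client cannot connect to the WebSocket server and receives a 503 error instead, as shown below.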
$ python3 -m websockets ws://websockets-server.default.svc.cluster.local:8080
Failed to connect to ws://websockets-server.default.svc.cluster.local:8080: server rejected WebSocket connection: HTTP 503.
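In other words, in the traffic policy, h2UpgradePolicy must not be set to UPGRADE; it has to be set to DO_NOT_UPGRADE, otherwise the connection fails with the 503 shown above. This is also why the access logs above contain "protocol":"HTTP/1.1".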
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  labels:
    provider: asm
  name: websockets-server
spec:
  host: websockets-server
  trafficPolicy:
    connectionPool:
      http:
        h2UpgradePolicy: DO_NOT_UPGRADE
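The version rule stated above can be captured in a few lines. This is only a sketch of the article's own guidance (below 1.12, do not upgrade); `safe_h2_upgrade_policy` is a hypothetical helper, and the 1.12 threshold comes from this article rather than from any Istio API.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '1.11' or '1.12.4'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def safe_h2_upgrade_policy(istio_version: str) -> str:
    """Pick an h2UpgradePolicy value for a WebSocket destination."""
    # Per the text above, versions below 1.12 return 503 when UPGRADE is used.
    # On 1.12+, UPGRADE additionally requires allow_connect on the server sidecar.
    if parse_version(istio_version) >= (1, 12):
        return "UPGRADE"
    return "DO_NOT_UPGRADE"
```

Comparing integer tuples rather than strings matters here: as strings, "1.9" would sort after "1.12".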
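How to support HTTP/2
Next, we run the example above on Istio 1.12 or later with HTTP/2 upgrade enabled. The configuration has two parts, described below.
1) Use a DestinationRule to enable HTTP/2 upgrade for the target service websockets-server, that is, set h2UpgradePolicy: UPGRADE, as shown here: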
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  labels:
    provider: asm
  name: websockets-server
spec:
  host: websockets-server
  trafficPolicy:
    connectionPool:
      http:
        h2UpgradePolicy: UPGRADE
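2) By default, WebSocket support over HTTP/2 is turned off, but Envoy can tunnel WebSocket over HTTP/2 streams so that a uniform HTTP/2 network can be used throughout the deployment. Use an EnvoyFilter on the target workload to turn this on by setting allow_connect: true, which permits upgraded protocol connections, as shown below.
To generate the following EnvoyFilter resource, open the Envoy filter template page in the plugin center, select the template "Set allow_connect to true to allow upgraded protocol connections", set Patch Context to SIDECAR_INBOUND, and bind it to the target workload, that is, the one labeled app: websockets-server.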
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: h2-upgrade-wss
  labels:
    asm-system: 'true'
    provider: asm
spec:
  workloadSelector:
    labels:
      app: websockets-server
  configPatches:
  - applyTo: NETWORK_FILTER
    match:
      context: SIDECAR_INBOUND
      proxy:
        proxyVersion: '^1\.*.*'
      listener:
        filterChain:
          filter:
            name: "envoy.filters.network.http_connection_manager"
    patch:
      operation: MERGE
      value:
        typed_config:
          '@type': type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          http2_protocol_options:
            allow_connect: true
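What allow_connect enables is the extended CONNECT method from RFC 8441: on an HTTP/2 stream there is no Upgrade header, so a WebSocket handshake is expressed as a CONNECT request with a :protocol pseudo-header. The sketch below shows roughly how an HTTP/1.1 upgrade request maps onto that form. It is an illustration of the RFC, not Envoy's code, and Envoy's actual header handling may differ in detail; the function name is hypothetical.

```python
from typing import Dict, List, Tuple


def h1_upgrade_to_h2_connect(
    authority: str, path: str, headers: Dict[str, str], tls: bool = False
) -> List[Tuple[str, str]]:
    """Map an HTTP/1.1 upgrade request to RFC 8441 extended CONNECT headers."""
    lowered = {k.lower(): v for k, v in headers.items()}
    upgrade = lowered.get("upgrade", "")
    if not upgrade or "upgrade" not in lowered.get("connection", "").lower():
        raise ValueError("not an HTTP/1.1 upgrade request")
    h2 = [
        (":method", "CONNECT"),
        (":protocol", upgrade.lower()),
        (":scheme", "https" if tls else "http"),
        (":path", path),
        (":authority", authority),
    ]
    # Connection-specific headers have no HTTP/2 equivalent; Host becomes :authority.
    dropped = {"connection", "upgrade", "host"}
    h2 += [(k, v) for k, v in lowered.items() if k not in dropped]
    return h2
```

A peer only accepts such a request if it has advertised support for extended CONNECT, which is what the allow_connect setting on the server-side sidecar turns on.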
After enabling h2UpgradePolicy, the client-side sidecar proxy log is as follows. The "protocol" field is "HTTP/1.1", which shows that the client issued its request over HTTP/1.1.
{"bytes_received":41,"requested_server_name":null,"x_forwarded_for":null,"downstream_local_address":"192.168.224.30:8080","duration":4151,"upstream_cluster":"outbound|8080||websockets-server.default.svc.cluster.local","path":"/","authority":"websockets-server.default.svc.cluster.local:8080","method":"GET","user_agent":"Python/3.9 websockets/10.2","route_name":"default","start_time":"2022-03-18T00:58:23.255Z","bytes_sent":25,"upstream_service_time":null,"protocol":"HTTP/1.1","istio_policy_status":null,"response_code":101,"trace_id":null,"downstream_remote_address":"10.208.0.117:60092","upstream_local_address":"10.208.0.117:60618","request_id":"8a6612d9-86d2-4b28-b6e2-0be0c98d9c1f","upstream_transport_failure_reason":null,"upstream_host":"10.208.0.116:80","response_flags":"UR"}
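After enabling h2UpgradePolicy, the server-side sidecar proxy log is as follows. Here the "protocol" field is "HTTP/2", which shows that the request arriving at the server side has been upgraded to HTTP/2.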
{"method":"GET","upstream_local_address":"127.0.0.6:34477","protocol":"HTTP/2","upstream_host":"10.208.0.116:80","response_code":101,"authority":"websockets-server.default.svc.cluster.local:8080","istio_policy_status":null,"x_forwarded_for":null,"bytes_sent":25,"upstream_service_time":null,"upstream_transport_failure_reason":null,"request_id":"8a6612d9-86d2-4b28-b6e2-0be0c98d9c1f","start_time":"2022-03-18T00:58:23.258Z","downstream_remote_address":"10.208.0.117:60618","requested_server_name":"outbound_.8080_._.websockets-server.default.svc.cluster.local","trace_id":null,"downstream_local_address":"10.208.0.116:80","user_agent":"Python/3.9 websockets/10.2","route_name":"default","upstream_cluster":"inbound|80||","path":"/","bytes_received":41,"response_flags":"UC","duration":4147}
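Because both sidecars log the same request_id, the two entries can be correlated to see the protocol on each hop. The sketch below does this with trimmed copies of the two entries above (other fields omitted); `protocols_by_request` is a hypothetical helper name.

```python
import json
from collections import defaultdict

# Trimmed copies of the client-side and server-side entries shown above.
CLIENT = (
    '{"request_id":"8a6612d9-86d2-4b28-b6e2-0be0c98d9c1f","protocol":"HTTP/1.1",'
    '"response_code":101,'
    '"upstream_cluster":"outbound|8080||websockets-server.default.svc.cluster.local"}'
)
SERVER = (
    '{"request_id":"8a6612d9-86d2-4b28-b6e2-0be0c98d9c1f","protocol":"HTTP/2",'
    '"response_code":101,"upstream_cluster":"inbound|80||"}'
)


def protocols_by_request(lines):
    """Group access log lines by request_id and record the protocol per direction."""
    hops = defaultdict(dict)
    for line in lines:
        entry = json.loads(line)
        # The upstream_cluster name starts with "outbound" or "inbound".
        direction = entry["upstream_cluster"].split("|", 1)[0]
        hops[entry["request_id"]][direction] = entry["protocol"]
    return dict(hops)
```

For the two entries above, the outbound (client sidecar) hop is recorded as HTTP/1.1 and the inbound (server sidecar) hop as HTTP/2, matching the explanation in the text.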
Summary
On Istio 1.12 or later, enabling HTTP/2 upgrade for the target service through a DestinationRule and turning on allow_connect for the target workload through an EnvoyFilter makes it possible to run WebSocket over HTTP/2.