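By inserting an additional External Processing plugin into the request processing path, Gateway with Inference Extension can integrate with Alibaba Cloud Content Moderation to review the input and output of generative AI, helping keep the content of AI applications lawful and compliant.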
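Prerequisites
- You have set up a Gateway with Inference Extension quick-start environment by following "Quick start: Gateway with Inference Extension intelligent inference routing".
- You have activated the enhanced edition of the Alibaba Cloud Content Moderation text moderation service and granted the required permissions to a RAM user. For details, see "Text Moderation PLUS service for large language models".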
Procedure
Step 1: Deploy an ACKTrafficFilter to declare the plugin service
- Create a file named acktrafficfilter.yaml with the following YAML.
apiVersion: inferenceextension.alibabacloud.com/v1alpha1
kind: ACKTrafficFilter
metadata:
  name: aisg
spec:
  aiContentSecurity:
    accessKey: XXXXXX
    secretKey: XXXXXX
    aliyunEndpoint: green-cip-vpc.cn-hangzhou.aliyuncs.com
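Adjust the key startup parameters of the plugin service to match your environment. The parameters are described below:
Parameter | Description |
accessKey | AccessKey ID (AK) of a RAM user that has the AliyunYundunGreenWebFullAccess permission. |
secretKey | AccessKey secret (SK) of a RAM user that has the AliyunYundunGreenWebFullAccess permission. |
aliyunEndpoint | Domain name of the Alibaba Cloud Content Moderation endpoint. To find the endpoint for your region, see "Text Moderation PLUS service for large language models". |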
- Run the following command to deploy the plugin service in the cluster.
kubectl apply -f acktrafficfilter.yaml
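Because the example manifest ships with XXXXXX placeholder credentials, a small guard run before `kubectl apply` can catch an unedited file. The following is a minimal sketch; `has_placeholders` is a hypothetical helper name and is not part of Gateway with Inference Extension.

```shell
# Hypothetical pre-flight helper: succeeds (exit 0) when the given manifest
# still contains the XXXXXX placeholder values from the example manifest.
has_placeholders() {
  grep -q 'XXXXXX' "$1"
}

# Warn if acktrafficfilter.yaml has not been edited yet.
# A missing file is treated the same as "no placeholders found".
if has_placeholders acktrafficfilter.yaml 2>/dev/null; then
  echo "replace the XXXXXX placeholders in acktrafficfilter.yaml before applying" >&2
fi
```

The check only looks for the literal placeholder string; it does not validate that the AccessKey pair is correct or has the required permission.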
Step 2: Reference the ACKTrafficFilter plugin in the HTTPRoute to enable content moderation
- Create a file named httproute.yaml with the following YAML.
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: mock-route
spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: mock-gateway
    sectionName: llm-gw
  rules:
  - backendRefs:
    - group: inference.networking.x-k8s.io
      kind: InferencePool
      name: mock-pool
    filters:
    - type: ExtensionRef
      extensionRef:
        group: inferenceextension.alibabacloud.com
        kind: ACKTrafficFilter
        name: aisg
    matches:
    - path:
        type: PathPrefix
        value: /
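- Run the following command to update the HTTPRoute resource in the cluster. The route references the ACKTrafficFilter resource through filters, which connects it to the content moderation service.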
kubectl apply -f httproute.yaml
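One way to confirm that the gateway picked up the updated route is to read the standard Gateway API `Accepted` condition from the HTTPRoute status. The helper below is a sketch under that assumption; `route_accepted` is a hypothetical name, and this guide does not state how the gateway reports problems with the ExtensionRef filter, so treat the result as a basic sanity check only.

```shell
# Hypothetical helper: prints the status of the Accepted condition that the
# gateway controller recorded for the given HTTPRoute (one value per parent).
# Usage: route_accepted mock-route
route_accepted() {
  kubectl get httproute "$1" \
    -o jsonpath='{.status.parents[*].conditions[?(@.type=="Accepted")].status}'
}
```

For a route with a single parent gateway, a value of `True` indicates the route was accepted; an empty result usually means the controller has not written a status yet.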
Step 3: Verify content moderation
- Get the gateway IP address.
export GATEWAY_ADDRESS=$(kubectl get gateway/mock-gateway -o jsonpath='{.status.addresses[0].value}')
echo ${GATEWAY_ADDRESS}
- Send a request from the sleep application.
kubectl exec deployment/sleep -it -- curl -X POST ${GATEWAY_ADDRESS}/v1/chat/completions \
  -H 'Content-Type: application/json' -H "host: example.com" -v -d '{
    "model": "mock",
    "max_completion_tokens": 100,
    "temperature": 0,
    "messages": [
      {
        "role": "user",
        "content": "<replace with any non-compliant content>"
      }
    ]
  }'
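Expected output (the returned message declines to answer questions on sensitive topics such as pornography, violence, and politics):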
*   Trying 192.168.12.230:80...
* Connected to 192.168.12.230 (192.168.12.230) port 80
> POST /v1/chat/completions HTTP/1.1
> Host: example.com
> User-Agent: curl/8.8.0
> Accept: */*
> Content-Type: application/json
> Content-Length: 184
>
* upload completely sent off: 184 bytes
< HTTP/1.1 200 OK
< date: Tue, 27 May 2025 08:21:37 GMT
< server: uvicorn
< content-length: 354
< content-type: application/json
<
* Connection #0 to host 192.168.12.230 left intact
{"id": "chatcmpl-EhVEIn8VZAMbAUGyoXHZNltTFH417","object":"chat.completion","model":"from-security-guard","choices":[{"index":0,"message":{"role":"assistant","content":"作为人工智能,我不会对涉及色情、暴力、政治等敏感话题进行回答。如果您有其他问题需要帮助,可以继续提问。"},"logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
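In the expected output, the request still returns HTTP 200, and the `model` field reads `from-security-guard` instead of the requested `mock`. Assuming that value marks a reply produced by the moderation plugin rather than by the model (an inference from the sample output, not a documented contract), a script could tell the two cases apart roughly as follows. `is_guard_response` is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds (exit 0) when a chat completion body carries
# the "from-security-guard" model value seen in the expected output.
is_guard_response() {
  printf '%s' "$1" | grep -Eq '"model"[[:space:]]*:[[:space:]]*"from-security-guard"'
}

# Example with a trimmed copy of the expected response body.
body='{"object":"chat.completion","model":"from-security-guard","choices":[]}'
if is_guard_response "$body"; then
  echo "blocked by content moderation"
else
  echo "passed through to the model"
fi
```

Matching on the `model` field with `grep` keeps the sketch dependency-free; a JSON-aware tool would be more robust if the response format changes.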