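> Retries are the simplest way to handle a failure that occurs while a request is being served: the request is sent again. Retrying can improve the availability and robustness of a service.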
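#### When to use retries
Retrying is the most direct and simple remedy for many request failures, and it can raise overall service quality, especially in complex operating environments. Used carelessly, it causes problems of its own: in the worst case the retries never succeed and only add latency and performance overhead. It is therefore important to configure retry rules that fit the runtime environment and the characteristics of the service.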
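#### Understanding retries through an example
The nginx service calls the httpd service, but httpd is faulty and returns an error response to nginx. To improve fault tolerance, nginx waits 15 seconds and then issues a first retry; if there is still no response it tries a second time, and if that also gets no response the request fails. In most scenarios faults are not permanent: they appear briefly and then clear on their own, so retrying can improve the availability of the service.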
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-deployment
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-deployment
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: nginx-deployment
    spec:
      containers:
        - image: 'nginx:latest'
          name: nginx-deployment
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: httpd-deployment
  name: httpd-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd-deployment
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: httpd-deployment
    spec:
      containers:
        - image: 'httpd:latest'
          name: httpd-deployment
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx-deployment
  type: ClusterIP
  ports:
    - name: http
      port: 80
      targetPort: 80
      protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: httpd-service
spec:
  selector:
    app: httpd-deployment
  type: ClusterIP
  ports:
    - name: http
      port: 80
      targetPort: 80
      protocol: TCP
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: nginx-vs
spec:
  hosts:
    - nginx-service
  http:
    - route:
        - destination:
            host: nginx-service
      retries:
        attempts: 3
        perTryTimeout: 5s
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpd-vs
spec:
  hosts:
    - httpd-service
  http:
    - fault:
        abort:
          percentage:
            value: 100
          httpStatus: 503
      route:
        - destination:
            host: httpd-service
```
##### Configure nginx as a reverse proxy for httpd to create an upstream/downstream call chain
```bash
kubectl exec -it nginx-deployment-56c94b9957-xgw88 -- sh
tee /etc/nginx/conf.d/default.conf <<-'EOF'
server {
  listen 80;
  server_name localhost;
  location / {
    proxy_pass http://httpd-service;
    proxy_http_version 1.1;
  }
}
EOF
nginx -t ; nginx -s reload
```
##### Test from the client container
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: client-deployment
  name: client-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: client-deployment
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: client-deployment
    spec:
      containers:
        - image: 'busybox:latest'
          name: client-deployment
          command: ["/bin/sh", "-c", "sleep 3600"]
```
Enter the container with ``kubectl exec -it client-deployment-56c94b9957-xgw88 -- sh``.
Run ``wget -q -O - http://nginx-service`` to send a test request.
Run ``kubectl logs -f nginx-deployment-86684f9cf6-lvxtj -c istio-proxy -n demo`` to watch the sidecar logs.
The logs show the original request plus three retries.
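#### Retry configuration parameters
##### The retries field
``retries`` defines the policy applied when a request fails. The policy covers the number of retries, the timeout, and the retry conditions.
- ``attempts``: required; the number of retries to make.
- ``perTryTimeout``: the timeout for each individual attempt. Supported units are ms, s, m, and h.
- ``retryOn``: the conditions that trigger a retry. Multiple conditions can be given as a comma-separated list.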
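
As a minimal sketch, the three fields can be combined as shown below. It reuses the ``nginx-vs`` and ``nginx-service`` names from the example; the ``retryOn`` value is an illustrative assumption, not something the original configuration sets.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: nginx-vs
spec:
  hosts:
    - nginx-service
  http:
    - route:
        - destination:
            host: nginx-service
      retries:
        attempts: 3                              # up to 3 retries after the initial request
        perTryTimeout: 5s                        # each attempt is given 5 seconds
        retryOn: gateway-error,connect-failure   # assumed conditions, chosen for illustration
```

Restricting ``retryOn`` to specific conditions like this keeps the proxy from retrying failures that a second attempt is unlikely to fix.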
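##### retryOn values
- ``5xx``: retry when the upstream service returns a 5xx response code or does not respond at all.
- ``gateway-error``: like ``5xx``, but retries only on 502, 503, and 504 response codes.
- ``connect-failure``: retry when the connection to the upstream service fails.
- ``retriable-4xx``: retry when the upstream service returns a retriable 4xx response code (currently only 409).
- ``refused-stream``: retry when the upstream service resets the stream with a REFUSED_STREAM error code.
- ``cancelled``: retry when the gRPC status code in the response headers is cancelled.
- ``deadline-exceeded``: retry when the gRPC status code in the response headers is deadline-exceeded.
- ``internal``: retry when the gRPC status code in the response headers is internal.
- ``resource-exhausted``: retry when the gRPC status code in the response headers is resource-exhausted.
- ``unavailable``: retry when the gRPC status code in the response headers is unavailable.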
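
The gRPC conditions can be mixed with the HTTP ones in the same comma-separated list. The sketch below assumes a hypothetical gRPC service named ``grpc-backend``, which is not part of the example; the attempt count and timeout are likewise placeholder values.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: grpc-backend-vs          # hypothetical name
spec:
  hosts:
    - grpc-backend               # hypothetical gRPC service
  http:                          # gRPC traffic is matched by http routes
    - route:
        - destination:
            host: grpc-backend
      retries:
        attempts: 2              # placeholder value
        perTryTimeout: 2s        # placeholder value
        retryOn: unavailable,resource-exhausted,connect-failure
```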