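The previous section covered how to use and start plain Ribbon (without Eureka) in a Spring Cloud environment. Next, let's look at retry configuration:
Retry configuration for plain Ribbon (without Eureka) in a Spring Cloud environment
Sample project
Enabling retries is simple: just add the spring-retry dependency to the classpath.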
    <dependency>
        <groupId>org.springframework.retry</groupId>
        <artifactId>spring-retry</artifactId>
    </dependency>
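Note that once the spring-retry dependency is added, Feign and Zuul will also gain a retry mechanism if your project uses them. We will discuss this in later chapters.
With the dependency in place, add the following configuration to application.properties: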
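    # Maximum number of next servers to retry on
    default-test.ribbon.MaxAutoRetriesNextServer=1
    # Maximum number of retries on each server, not counting the first call
    default-test.ribbon.MaxAutoRetries=1
    # Status codes that trigger a retry
    default-test.ribbon.retryableStatusCodes=500
    # Whether to retry all operations; if false, only GET requests are retried
    default-test.ribbon.OkToRetryOnAllOperations=true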
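Start the application again and we can see that retries are triggered.
In the previous section we saw that RibbonClientConfiguration initializes a DefaultRetryHandler bean, but on its own it has no effect here: the retry configuration above only takes effect once the spring-retry dependency is added.
Source code analysis
After the spring-retry dependency is added, the LoadBalancerInterceptor mentioned earlier is replaced by RetryLoadBalancerInterceptor, whose intercept method looks like this: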
RetryLoadBalancerInterceptor.java
    public ClientHttpResponse intercept(final HttpRequest request, final byte[] body,
            final ClientHttpRequestExecution execution) throws IOException {
        final URI originalUri = request.getURI();
        final String serviceName = originalUri.getHost();
        Assert.state(serviceName != null, "Request URI does not contain a valid hostname: " + originalUri);
        // Create a LoadBalancedRetryPolicy from the service name and the LoadBalancerClient
        final LoadBalancedRetryPolicy retryPolicy = lbRetryPolicyFactory.create(serviceName, loadBalancer);
        // Use the configured RetryTemplate, or create a new one if none is set
        RetryTemplate template = this.retryTemplate == null ? new RetryTemplate() : this.retryTemplate;
        // Once the retries are exhausted, throw the last exception
        template.setThrowLastExceptionOnExhausted(true);
        // Set the retry policy
        template.setRetryPolicy(
                !lbProperties.isEnabled() || retryPolicy == null
                        ? new NeverRetryPolicy()
                        : new InterceptorRetryPolicy(request, retryPolicy, loadBalancer, serviceName));
        // Set the callback and execute the request
        return template.execute(new RetryCallback<ClientHttpResponse, IOException>() {
            @Override
            public ClientHttpResponse doWithRetry(RetryContext context) throws IOException {
                ServiceInstance serviceInstance = null;
                // As shown later, when a call fails and the retry switches servers, a new server is
                // chosen from the LoadBalancer and stored in this lbContext; it is read back here
                if (context instanceof LoadBalancedRetryContext) {
                    LoadBalancedRetryContext lbContext = (LoadBalancedRetryContext) context;
                    serviceInstance = lbContext.getServiceInstance();
                }
                // null means this is the first invocation of the callback
                if (serviceInstance == null) {
                    serviceInstance = loadBalancer.choose(serviceName);
                }
                // Execute the request against this serviceInstance
                ClientHttpResponse response = RetryLoadBalancerInterceptor.this.loadBalancer.execute(
                        serviceName, serviceInstance, requestFactory.createRequest(request, body, execution));
                // Read the response status code
                int statusCode = response.getRawStatusCode();
                // Check whether the status code is configured as retryable
                if (retryPolicy != null && retryPolicy.retryableStatusCode(statusCode)) {
                    response.close();
                    throw new RetryableStatusCodeException(serviceName, statusCode);
                }
                return response;
            }
        });
    }
The LoadBalancedRetryPolicy created here is the retry policy, namely RibbonLoadBalancedRetryPolicy:
RibbonLoadBalancedRetryPolicy.java
    public static final IClientConfigKey<String> RETRYABLE_STATUS_CODES =
            new CommonClientConfigKey<String>("retryableStatusCodes") {};

    private int sameServerCount = 0;
    private int nextServerCount = 0;
    private String serviceId;
    private RibbonLoadBalancerContext lbContext;
    private ServiceInstanceChooser loadBalanceChooser;
    List<Integer> retryableStatusCodes = new ArrayList<>();

    public RibbonLoadBalancedRetryPolicy(String serviceId, RibbonLoadBalancerContext context,
            ServiceInstanceChooser loadBalanceChooser, IClientConfig clientConfig) {
        this.serviceId = serviceId;
        this.lbContext = context;
        this.loadBalanceChooser = loadBalanceChooser;
        String retryableStatusCodesProp = clientConfig.getPropertyAsString(RETRYABLE_STATUS_CODES, "");
        String[] retryableStatusCodesArray = retryableStatusCodesProp.split(",");
        for (String code : retryableStatusCodesArray) {
            if (!StringUtils.isEmpty(code)) {
                try {
                    retryableStatusCodes.add(Integer.valueOf(code.trim()));
                } catch (NumberFormatException e) {
                    //TODO log
                }
            }
        }
    }
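To make the parsing step in the constructor above easier to follow, here is a minimal stand-alone sketch of the same loop. The class name RetryableStatusCodeParser and the method parse are hypothetical helpers introduced only for illustration; they are not part of Spring Cloud.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; it mirrors the parsing loop of the
// constructor above but is not a Spring Cloud class.
class RetryableStatusCodeParser {

    // Splits a comma-separated property value and keeps only the numeric entries.
    static List<Integer> parse(String property) {
        List<Integer> codes = new ArrayList<>();
        for (String code : property.split(",")) {
            if (!code.isEmpty()) {
                try {
                    codes.add(Integer.valueOf(code.trim()));
                } catch (NumberFormatException e) {
                    // Non-numeric entries are skipped, as in the constructor above.
                }
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        System.out.println(parse("500, 502,abc")); // prints [500, 502]
        System.out.println(parse(""));             // prints []
    }
}
```

Whitespace around a code is tolerated and invalid entries are dropped silently, so a typo in retryableStatusCodes does not fail at startup; that status code is simply never retried.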
In this project, we set retryableStatusCodes to 500.
Now let's look at the method RetryTemplate uses to execute the request:
    protected <T, E extends Throwable> T doExecute(RetryCallback<T, E> retryCallback,
            RecoveryCallback<T> recoveryCallback, RetryState state) throws E, ExhaustedRetryException {
        // initialization code omitted
        // Loop while a retry is still allowed and the retries are not exhausted
        while (canRetry(retryPolicy, context) && !context.isExhaustedOnly()) {
            try {
                // Reset lastException; if the retries run out without success,
                // the most recent lastException is the one that gets thrown
                lastException = null;
                // Invoke the callback shown above; on success its response is returned.
                // The callback checks the status code: with the configuration in this project,
                // a 500 response throws RetryableStatusCodeException, while other status codes
                // throw no exception and therefore do not trigger a retry
                return retryCallback.doWithRetry(context);
            } catch (Throwable e) {
                // Catch the exception and record it in lastException
                lastException = e;
                try {
                    // This mainly calls registerThrowable on the policy
                    registerThrowable(retryPolicy, state, context, e);
                } catch (Exception ex) {
                    throw new TerminatedRetryException("Could not register throwable", ex);
                } finally {
                    doOnErrorInterceptors(retryCallback, context, e);
                }
            }
            // subsequent handling code omitted...
        }
    }
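The key point of the loop above is that only an exception leads to another attempt. A minimal sketch of that idea follows. It uses simplified stand-in types (RetryLoopSketch, Callback and RetryableStatusException are hypothetical names, not spring-retry classes) and a fixed attempt limit in place of a real retry policy:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Simplified stand-ins for illustration only; none of these types are spring-retry classes.
class RetryLoopSketch {

    interface Callback {
        int call(int attempt);
    }

    static class RetryableStatusException extends RuntimeException {
        RetryableStatusException(int status) {
            super("retryable status " + status);
        }
    }

    // Runs the callback until it returns or maxAttempts is used up,
    // then rethrows the last exception.
    static int execute(Callback callback, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastException = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return callback.call(attempt);
            } catch (RuntimeException e) {
                lastException = e; // remember the failure and try again
            }
        }
        throw lastException;
    }

    // Simulates a server that answers with the given status codes, one per attempt.
    static int callWithStatuses(int[] statuses, int retryableStatus, int maxAttempts, AtomicInteger calls) {
        return execute(attempt -> {
            calls.incrementAndGet();
            int status = statuses[attempt - 1];
            if (status == retryableStatus) {
                throw new RetryableStatusException(status);
            }
            return status;
        }, maxAttempts);
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        int status = callWithStatuses(new int[] {500, 500, 200}, 500, 4, calls);
        System.out.println(status + " after " + calls.get() + " calls"); // 200 after 3 calls

        calls.set(0);
        status = callWithStatuses(new int[] {404}, 500, 4, calls);
        System.out.println(status + " after " + calls.get() + " calls"); // 404 after 1 calls
    }
}
```

A 404 comes back after a single call because it is returned as a normal response, which matches the behaviour described for status codes that are not listed in retryableStatusCodes.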
Let's first look at how canRetry is decided. In essence it comes down to the canRetry method of RibbonLoadBalancedRetryPolicy:
RibbonLoadBalancedRetryPolicy.java
    public boolean canRetry(LoadBalancedRetryContext context) {
        HttpMethod method = context.getRequest().getMethod();
        // GET requests can always be retried;
        // if isOkToRetryOnAllOperations is true (default false), every HTTP method is retried
        return HttpMethod.GET == method || lbContext.isOkToRetryOnAllOperations();
    }
Whether a RetryableStatusCodeException is thrown is decided by retryPolicy.retryableStatusCode(statusCode):
RibbonLoadBalancedRetryPolicy.java
    public boolean retryableStatusCode(int statusCode) {
        // retryableStatusCodes comes from the default-test.ribbon.retryableStatusCodes=500 setting
        return retryableStatusCodes.contains(statusCode);
    }
When a call throws an exception, the policy's registerThrowable method is invoked:
    public void registerThrowable(LoadBalancedRetryContext context, Throwable throwable) {
        // If the current server cannot be retried any more but another server can,
        // choose a new instance and store it in the context
        if (!canRetrySameServer(context) && canRetryNextServer(context)) {
            context.setServiceInstance(loadBalanceChooser.choose(serviceId));
        }
        // When sameServerCount has reached MaxRetriesOnSameServer
        // (the default-test.ribbon.MaxAutoRetries setting) and a retry is allowed
        if (sameServerCount >= lbContext.getRetryHandler().getMaxRetriesOnSameServer() && canRetry(context)) {
            // We are switching to the next server, so reset sameServerCount
            sameServerCount = 0;
            nextServerCount++;
            // If canRetryNextServer is false, the retries are exhausted
            if (!canRetryNextServer(context)) {
                context.setExhaustedOnly();
            }
        } else {
            sameServerCount++;
        }
    }
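To see how the two counters interact, the sketch below replays registerThrowable for a request whose every attempt fails and counts the attempts. It is a stand-alone simulation, not the Spring Cloud class. The bodies of canRetrySameServer and canRetryNextServer are not shown in this article, so the definitions used here (sameServerCount < max and nextServerCount <= max) are assumptions, and canRetry is assumed to be true, as for a GET request.

```java
// Stand-alone simulation of the counters; not the Spring Cloud class itself.
class RetryCounterSimulation {
    private final int maxSameServer; // plays the role of MaxAutoRetries
    private final int maxNextServer; // plays the role of MaxAutoRetriesNextServer
    private int sameServerCount = 0;
    private int nextServerCount = 0;
    private boolean exhausted = false;

    RetryCounterSimulation(int maxSameServer, int maxNextServer) {
        this.maxSameServer = maxSameServer;
        this.maxNextServer = maxNextServer;
    }

    // Assumed definition: the article does not show this method.
    private boolean canRetrySameServer() {
        return sameServerCount < maxSameServer;
    }

    // Assumed definition: the article does not show this method.
    private boolean canRetryNextServer() {
        return nextServerCount <= maxNextServer;
    }

    // Mirrors the counter updates of registerThrowable shown above.
    void registerFailure() {
        if (!canRetrySameServer() && canRetryNextServer()) {
            // the real policy would choose a new ServiceInstance here
        }
        if (sameServerCount >= maxSameServer) {
            sameServerCount = 0;
            nextServerCount++;
            if (!canRetryNextServer()) {
                exhausted = true;
            }
        } else {
            sameServerCount++;
        }
    }

    // Counts how many attempts are made when every call fails.
    static int attemptsWhenAllFail(int maxSameServer, int maxNextServer) {
        RetryCounterSimulation sim = new RetryCounterSimulation(maxSameServer, maxNextServer);
        int attempts = 0;
        while (sim.canRetryNextServer() && !sim.exhausted) {
            attempts++;
            sim.registerFailure();
        }
        return attempts;
    }

    public static void main(String[] args) {
        // MaxAutoRetries=1 and MaxAutoRetriesNextServer=1, as in the configuration above
        System.out.println(attemptsWhenAllFail(1, 1)); // prints 4
    }
}
```

Under these assumptions the total number of attempts works out to (MaxAutoRetries + 1) * (MaxAutoRetriesNextServer + 1), so the configuration in this project allows at most four calls: two on the first server and two on the next one.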
In summary, the flow is shown in the figure below: