A Reliability Tool: The Retry Mechanism

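Summary: In day-to-day development we often need to call external services and APIs. From the caller's point of view an external service is generally unreliable. On a poor network in particular, jitter easily causes request timeouts and similar failures, and a retry-on-failure strategy is then needed to call the API again and obtain the result. Retry strategies are also widely used in service governance, for example in periodic checks that probe whether a service is still alive.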

Spring Retry: Spring's Exception Retry Framework


Spring Retry can be integrated into Spring or Spring Boot projects. Its annotation style is built on AOP, so the aspectjweaver jar must be added alongside it.


1. Add the Maven dependencies


<dependency>
    <groupId>org.springframework.retry</groupId>
    <artifactId>spring-retry</artifactId>
    <version>1.1.2.RELEASE</version>
</dependency>
<dependency>
    <groupId>org.aspectj</groupId>
    <artifactId>aspectjweaver</artifactId>
    <version>1.8.6</version>
</dependency>


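2. Add the @Retryable and @Recover annotations


@Retryable: the annotated method is retried when it throws an exception.


  • value: the exception types that trigger a retry
  • include: same as value; empty by default. When exclude is also empty, every exception is retried
  • exclude: exception types that are not retried; empty by default. When include is also empty, every exception is retried
  • maxAttempts: the maximum number of attempts, including the first call; 3 by default
  • backoff: the back-off policy applied between attempts; defaults to a plain @Backoff(), i.e. a fixed 1000 ms delay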


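The @Backoff annotation


  • delay: how long to wait before retrying, in milliseconds
  • multiplier: the factor by which the delay grows. For example, with delay=5000L and multiplier=2 the first retry happens after 5 s, the second after 10 s, and the third after 20 s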
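
To make the multiplier arithmetic concrete, here is a minimal plain-Java sketch that computes the pause before each retry. It does not call Spring Retry. The names BackoffScheduleSketch and delaysMillis are made up for illustration, and the sketch ignores any maximum-delay cap the framework may apply.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: reproduces the delay/multiplier arithmetic, not Spring Retry itself.
class BackoffScheduleSketch {

    // Returns the pause in milliseconds before retry 1, 2, ..., retries.
    static List<Long> delaysMillis(long delay, double multiplier, int retries) {
        List<Long> result = new ArrayList<>();
        double current = delay;
        for (int i = 0; i < retries; i++) {
            result.add(Math.round(current));
            current *= multiplier; // each pause is the previous one times the multiplier
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(delaysMillis(5000L, 2.0, 3)); // prints [5000, 10000, 20000]
        System.out.println(delaysMillis(5000L, 1.0, 3)); // prints [5000, 5000, 5000]
    }
}
```

With multiplier = 1 the schedule degenerates into a fixed delay, which is the setting used in the example service in this article.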


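@Recover: once the configured number of attempts has been used up, the annotated method is called back. It is a convenient place for logging. Note that the callback only happens when the exception thrown matches the type of the method's exception parameter.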


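import java.time.LocalDateTime;
import org.springframework.retry.annotation.Backoff;
import org.springframework.retry.annotation.Recover;
import org.springframework.retry.annotation.Retryable;
import org.springframework.stereotype.Service;

@Service
public class RemoteService {
    // Up to 5 attempts in total, with a fixed 5 s pause (multiplier = 1) between them.
    @Retryable(value = {Exception.class}, maxAttempts = 5, backoff = @Backoff(delay = 5000L, multiplier = 1))
    public void call() {
        System.out.println(LocalDateTime.now() + ": do something...");
        // The message means "remote call failed"; it is kept as-is to match the output shown later.
        throw new RuntimeException(LocalDateTime.now() + ": 运行调用异常");
    }

    // Called back once every attempt has failed.
    @Recover
    public void recover(Exception e) {
        System.out.println(e.getMessage());
    }
}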
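
Conceptually, the two annotations behave like the hand-written loop below: try up to maxAttempts times, pause between attempts, and pass the last exception to a recovery callback when every attempt has failed. This is a plain-Java sketch of the idea, not Spring Retry's implementation. RetryWithRecoverSketch, callWithRetry, and the parameter names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

// Illustrative only: a hand-written retry loop, not Spring Retry's implementation.
class RetryWithRecoverSketch {

    static <T> T callWithRetry(Callable<T> task, int maxAttempts, long delayMillis,
                               Function<Exception, T> recover) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e; // remember the failure for the recovery callback
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delayMillis); // fixed pause between attempts
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying when interrupted
                }
            }
        }
        return recover.apply(last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = callWithRetry(() -> {
            calls[0]++;
            throw new IllegalStateException("attempt " + calls[0] + " failed");
        }, 3, 10L, e -> "recovered from: " + e.getMessage());
        System.out.println(result); // prints "recovered from: attempt 3 failed"
    }
}
```

The recovery callback receives only the last exception, which mirrors how a @Recover method is typically used for logging or returning a fallback value.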


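3. Enable retry


Add @EnableRetry to the application's main class to enable retry. It can also be placed on the service that uses retry, or on a @Configuration class.


We recommend putting all @Enable... annotations on the main class, so that the enabled features are managed in one clearly visible place.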


@SpringBootApplication
@EnableRetry
public class App {
    public static void main(String[] args) throws Exception {
        ConfigurableApplicationContext context = SpringApplication.run(App.class, args);
        System.out.println("Start app success.");
        RemoteService bean = context.getBean(RemoteService.class);
        bean.call();
    }
}


4. Start the service and run the test


Calling the service through the context in the main class prints the following:

2019-03-09T15:22:12.781: do something...
2019-03-09T15:22:17.808: do something...
2019-03-09T15:22:22.835: do something...
2019-03-09T15:22:27.861: do something...
2019-03-09T15:22:32.887: do something...
2019-03-09T15:22:32.887: 运行调用异常


Guava-Retryer: A Retry Component Built on Guava


Here is the author's own description of the component: This is a small extension to Google’s Guava library to allow for the creation of configurable retrying strategies for an arbitrary function call, such as something that talks to a remote service with flaky uptime.


First, add the Maven coordinates:


<dependency>
    <groupId>com.github.rholder</groupId>
    <artifactId>guava-retrying</artifactId>
    <version>2.0.0</version>
</dependency>


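1. Main interfaces and strategies


  • Attempt: a single execution of the task;
  • AttemptTimeLimiter: a time limit on a single attempt (if an attempt runs past the limit, that attempt is terminated);
  • BlockStrategies: the blocking strategy, in plain terms what to do between the end of one attempt and the start of the next. The default is BlockStrategies.THREAD_SLEEP_STRATEGY, which calls Thread.sleep(sleepTime);
  • RetryException: the exception that signals a failed retry;
  • RetryListener: a custom retry listener, which can be used for asynchronous error logging;
  • StopStrategy: when to stop retrying. Three are provided:
      • StopAfterDelayStrategy: sets a maximum total running time, for example 10 s. Regardless of how many attempts have been made, once a retry exceeds that time the task ends with a RetryException;
      • NeverStopStrategy: never stops. Useful for polling until the expected result is returned;
      • StopAfterAttemptStrategy: sets a maximum number of attempts. Once it is exceeded, retrying stops with a RetryException;
  • WaitStrategy: the wait strategy, which controls the interval between attempts. It returns how long to wait before the next attempt:
      • FixedWaitStrategy: fixed wait;
      • RandomWaitStrategy: random wait (given a minimum and a maximum, the wait is a random value in that range);
      • IncrementingWaitStrategy: incrementing wait (given an initial value and a step, the wait grows with the number of attempts);
      • ExponentialWaitStrategy: exponential wait;
      • FibonacciWaitStrategy: Fibonacci wait;
      • ExceptionWaitStrategy: a wait derived from the exception that was thrown;
      • CompositeWaitStrategy: a combination of several wait strategies;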


2. Retry based on the result


Use case: the return value decides whether to retry.

The task to retry:

private static Callable<String> callableWithResult() {
    return new Callable<String>() {
        int counter = 0;
        public String call() throws Exception {
            counter++;
            System.out.println(LocalDateTime.now() + ": do something... " + counter);
            if (counter < 5) {
                return "james";
            }
            return "kobe";
        }
    };
}

Test:

public static void main(String[] args) throws ExecutionException, RetryException {
    Retryer<String> retry = RetryerBuilder.<String>newBuilder()
            .retryIfResult(result -> !result.contains("kobe")).build();
    retry.call(callableWithResult());
}

Output:

2019-03-09T15:40:23.706: do something... 1
2019-03-09T15:40:23.710: do something... 2
2019-03-09T15:40:23.711: do something... 3
2019-03-09T15:40:23.711: do something... 4
2019-03-09T15:40:23.711: do something... 5
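
The behaviour of retryIfResult can be pictured as the loop below: call the task, and call it again for as long as the predicate marks the result as unacceptable. This is a plain-Java sketch rather than guava-retrying's code. ResultRetrySketch and callUntilAccepted are hypothetical names, and a maxAttempts guard is added here so the sketch cannot loop forever.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Illustrative only: a result-driven retry loop, not guava-retrying's implementation.
class ResultRetrySketch {

    // Calls the task until shouldRetry rejects no more results, or maxAttempts is reached.
    static <T> T callUntilAccepted(Supplier<T> task, Predicate<T> shouldRetry, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T result = task.get();
            if (!shouldRetry.test(result)) {
                return result; // the predicate no longer asks for a retry
            }
        }
        throw new IllegalStateException("No accepted result after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        int[] counter = {0};
        String result = callUntilAccepted(
                () -> ++counter[0] < 5 ? "james" : "kobe", // same shape as the task above
                r -> !r.contains("kobe"),
                10);
        System.out.println(result + " after " + counter[0] + " calls"); // prints "kobe after 5 calls"
    }
}
```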


3. Retry based on the exception


Use case: the type of the exception thrown decides whether to retry.

The task to retry:

private static Callable<String> callableWithResult() {
    return new Callable<String>() {
        int counter = 0;
        public String call() throws Exception {
            counter++;
            System.out.println(LocalDateTime.now() + ": do something... " + counter);
            if (counter < 5) {
                throw new RuntimeException("Run exception");
            }
            return "kobe";
        }
    };
}

Test:

public static void main(String[] args) throws ExecutionException, RetryException{
    Retryer<String> retry = RetryerBuilder.<String>newBuilder()
            .retryIfRuntimeException()
            .withStopStrategy(StopStrategies.neverStop())
            .build();
    retry.call(callableWithResult());
}

Output:

2019-03-09T15:53:27.682: do something... 1
2019-03-09T15:53:27.686: do something... 2
2019-03-09T15:53:27.686: do something... 3
2019-03-09T15:53:27.687: do something... 4
2019-03-09T15:53:27.687: do something... 5


4. Retry strategy: retry without limit


Use case: when exceptions occur, keep retrying without limit (the default strategy) until a valid result is returned;

Retryer<String> retry = RetryerBuilder.<String>newBuilder()
        .retryIfRuntimeException()
        .withStopStrategy(StopStrategies.neverStop())
        .build();
retry.call(callableWithResult());


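5. Retry strategy: set a maximum number of attempts


Use case: when exceptions occur, retry up to a maximum number of attempts; once that number is reached, an exception is thrown;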

private static Callable<String> callableWithResult() {
    return new Callable<String>() {
        int counter = 0;
        public String call() throws Exception {
            counter++;
            System.out.println(LocalDateTime.now() + ": do something... " + counter);
            throw new RuntimeException("Run exception");
        }
    };
}

Test:

public static void main(String[] args) throws ExecutionException, RetryException{
    Retryer<String> retry = RetryerBuilder.<String>newBuilder()
            .retryIfRuntimeException()
            .withStopStrategy(StopStrategies.stopAfterAttempt(4))
            .build();
    retry.call(callableWithResult());
}

Output:

2019-03-09T16:02:29.471: do something... 1
2019-03-09T16:02:29.477: do something... 2
2019-03-09T16:02:29.478: do something... 3
2019-03-09T16:02:29.478: do something... 4
Exception in thread "main" com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 4 attempts.


6. Wait strategy: fixed wait between retries


Use case: wait a fixed 10 s between retries;

public static void main(String[] args) throws ExecutionException, RetryException {
    Retryer<String> retry = RetryerBuilder.<String>newBuilder()
            .retryIfRuntimeException()
            .withStopStrategy(StopStrategies.stopAfterAttempt(4))
            .withWaitStrategy(WaitStrategies.fixedWait(10, TimeUnit.SECONDS))
            .build();
    retry.call(callableWithResult());
}

Test output; the calls are 10 s apart:

2019-03-09T16:06:34.457: do something... 1
2019-03-09T16:06:44.660: do something... 2
2019-03-09T16:06:54.923: do something... 3
2019-03-09T16:07:05.187: do something... 4
Exception in thread "main" com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 4 attempts.


7. Wait strategy: wait that grows by a fixed step


Use case: set an initial wait and a fixed increment, without setting a maximum wait;

public static void main(String[] args) throws ExecutionException, RetryException {
    Retryer<String> retry = RetryerBuilder.<String>newBuilder()
            .retryIfRuntimeException()
            .withStopStrategy(StopStrategies.stopAfterAttempt(4))
            .withWaitStrategy(WaitStrategies.incrementingWait(1, TimeUnit.SECONDS, 1, TimeUnit.SECONDS))
            .build();
    retry.call(callableWithResult());
}

Test output; the interval between calls grows by 1 s each time:

2019-03-09T18:46:30.256: do something... 1
2019-03-09T18:46:31.260: do something... 2
2019-03-09T18:46:33.260: do something... 3
2019-03-09T18:46:36.260: do something... 4
Exception in thread "main" com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 4 attempts.
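
The 1 s, 2 s, 3 s gaps in the output above are consistent with a wait of initial + increment × (n − 1) after failed attempt n. The sketch below works through that arithmetic in plain Java. The formula is inferred from the observed output rather than quoted from the library, and IncrementingWaitSketch and sleepMillis are hypothetical names.

```java
// Illustrative only: assumes wait = initial + increment * (attempt - 1), clamped at zero.
class IncrementingWaitSketch {

    // Wait in milliseconds after the given failed attempt (attempts are numbered from 1).
    static long sleepMillis(long initialMillis, long incrementMillis, long attemptNumber) {
        long result = initialMillis + incrementMillis * (attemptNumber - 1);
        return Math.max(result, 0L);
    }

    public static void main(String[] args) {
        for (long attempt = 1; attempt <= 3; attempt++) {
            // prints 1000, 2000 and 3000 for attempts 1, 2 and 3
            System.out.println("after attempt " + attempt + ": " + sleepMillis(1000L, 1000L, attempt) + " ms");
        }
    }
}
```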


8. Wait strategy: exponentially growing wait


Use case: grow the wait exponentially, scaled by a multiplier, up to a maximum wait;

public static void main(String[] args) throws ExecutionException, RetryException {
    Retryer<String> retry = RetryerBuilder.<String>newBuilder()
            .retryIfRuntimeException()
            .withStopStrategy(StopStrategies.stopAfterAttempt(4))
            .withWaitStrategy(WaitStrategies.exponentialWait(1000, 10, TimeUnit.SECONDS))
            .build();
    retry.call(callableWithResult());
}

This strategy and its parameters are not obvious at first sight, so let's look at the source:

@Immutable
private static final class ExponentialWaitStrategy implements WaitStrategy {
    private final long multiplier;
    private final long maximumWait;
    public ExponentialWaitStrategy(long multiplier, long maximumWait) {
        Preconditions.checkArgument(multiplier > 0L, "multiplier must be > 0 but is %d", new Object[]{Long.valueOf(multiplier)});
        Preconditions.checkArgument(maximumWait >= 0L, "maximumWait must be >= 0 but is %d", new Object[]{Long.valueOf(maximumWait)});
        Preconditions.checkArgument(multiplier < maximumWait, "multiplier must be < maximumWait but is %d", new Object[]{Long.valueOf(multiplier)});
        this.multiplier = multiplier;
        this.maximumWait = maximumWait;
    }
    public long computeSleepTime(Attempt failedAttempt) {
        double exp = Math.pow(2.0D, (double)failedAttempt.getAttemptNumber());
        long result = Math.round((double)this.multiplier * exp);
        if(result > this.maximumWait) {
            result = this.maximumWait;
        }
        return result >= 0L?result:0L;
    }
}

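The source shows that ExponentialWaitStrategy is an immutable nested class whose constructor validates its arguments. The key method is computeSleepTime(), which computes the delay as follows:


  1. Raise 2 to the power of the failed attempt's number.


  2. Multiply that value by the constructor's first argument and round the product to get the delay in milliseconds.


  3. Cap the delay at maximumWait.

So with 1000 as the first argument and a 10 s maximum, the intervals should be 2 s, 4 s and 8 s.

Test output; the intervals are 2×1000, 4×1000 and 8×1000 ms: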

2019-03-09T19:11:23.905: do something... 1
2019-03-09T19:11:25.908: do something... 2
2019-03-09T19:11:29.908: do something... 3
2019-03-09T19:11:37.909: do something... 4
Exception in thread "main" com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 4 attempts.
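
The computation in computeSleepTime() can be replayed outside the library. The sketch below mirrors the arithmetic of the quoted source in plain Java, with all times in milliseconds. ExponentialWaitSketch and sleepMillis are hypothetical names, not part of guava-retrying.

```java
// Illustrative only: mirrors the arithmetic of the quoted computeSleepTime() method.
class ExponentialWaitSketch {

    // Wait in milliseconds after the given failed attempt (attempts are numbered from 1).
    static long sleepMillis(long multiplier, long maximumWaitMillis, long attemptNumber) {
        double exp = Math.pow(2.0, attemptNumber);
        long result = Math.round(multiplier * exp);
        if (result > maximumWaitMillis) {
            result = maximumWaitMillis; // cap at the configured maximum
        }
        return Math.max(result, 0L);
    }

    public static void main(String[] args) {
        for (long attempt = 1; attempt <= 4; attempt++) {
            // prints 2000, 4000, 8000 and 10000 for attempts 1 to 4
            System.out.println("after attempt " + attempt + ": " + sleepMillis(1000L, 10_000L, attempt) + " ms");
        }
    }
}
```

The fourth value shows the cap at work: 16000 ms is cut down to the 10 s maximum. It does not appear in the run above because retrying stops after the fourth attempt.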


9. Wait strategy: Fibonacci wait


Use case: grow the wait along the Fibonacci sequence, scaled by a multiplier, up to a maximum wait. The Fibonacci sequence: 1, 1, 2, 3, 5, 8, 13, 21, 34, ...

public static void main(String[] args) throws ExecutionException, RetryException {
    Retryer<String> retry = RetryerBuilder.<String>newBuilder()
            .retryIfRuntimeException()
            .withStopStrategy(StopStrategies.stopAfterAttempt(4))
            .withWaitStrategy(WaitStrategies.fibonacciWait(1000, 10, TimeUnit.SECONDS))
            .build();
    retry.call(callableWithResult());
}

Again, the source shows that the delay is the Fibonacci number multiplied by the first argument (in milliseconds):

public long computeSleepTime(Attempt failedAttempt) {
    long fib = this.fib(failedAttempt.getAttemptNumber());
    long result = this.multiplier * fib;
    if(result > this.maximumWait || result < 0L) {
        result = this.maximumWait;
    }
    return result >= 0L?result:0L;
}

Test output; the intervals are 1×1000, 1×1000 and 2×1000 ms:

2019-03-09T19:28:43.903: do something... 1
2019-03-09T19:28:44.909: do something... 2
2019-03-09T19:28:45.928: do something... 3
2019-03-09T19:28:47.928: do something... 4
Exception in thread "main" com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 4 attempts.
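
The same computation can be tried in isolation. The sketch below follows the quoted computeSleepTime() in plain Java, with all times in milliseconds. The library's own fib method is not shown in this article, so the fib helper here is an assumed, straightforward iterative version with fib(1) = fib(2) = 1. FibonacciWaitSketch is a hypothetical name.

```java
// Illustrative only: mirrors the quoted computeSleepTime(); fib() is an assumed helper.
class FibonacciWaitSketch {

    // Iterative Fibonacci with fib(0) = 0 and fib(1) = fib(2) = 1.
    static long fib(long n) {
        if (n <= 0) {
            return 0L;
        }
        long prev = 0L;
        long current = 1L;
        for (long i = 1; i < n; i++) {
            long next = prev + current;
            prev = current;
            current = next;
        }
        return current;
    }

    // Wait in milliseconds after the given failed attempt (attempts are numbered from 1).
    static long sleepMillis(long multiplier, long maximumWaitMillis, long attemptNumber) {
        long result = multiplier * fib(attemptNumber);
        if (result > maximumWaitMillis || result < 0L) {
            result = maximumWaitMillis; // cap at the maximum, also on overflow
        }
        return Math.max(result, 0L);
    }

    public static void main(String[] args) {
        for (long attempt = 1; attempt <= 4; attempt++) {
            // prints 1000, 1000, 2000 and 3000 for attempts 1 to 4
            System.out.println("after attempt " + attempt + ": " + sleepMillis(1000L, 10_000L, attempt) + " ms");
        }
    }
}
```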


10. Wait strategy: combining wait strategies


Use case: when no single built-in strategy fits, several strategies can be combined.

public static void main(String[] args) throws ExecutionException, RetryException {
    Retryer<String> retry = RetryerBuilder.<String>newBuilder()
            .retryIfRuntimeException()
            .withStopStrategy(StopStrategies.stopAfterAttempt(10))
            .withWaitStrategy(WaitStrategies.join(
                    WaitStrategies.exponentialWait(1000, 100, TimeUnit.SECONDS),
                    WaitStrategies.fixedWait(2, TimeUnit.SECONDS)))
            .build();
    retry.call(callableWithResult());
}

Again, reading the source is the way to understand what a combined strategy means:

public long computeSleepTime(Attempt failedAttempt) {
    long waitTime = 0L;
    WaitStrategy waitStrategy;
    for(Iterator i$ = this.waitStrategies.iterator(); i$.hasNext(); waitTime += waitStrategy.computeSleepTime(failedAttempt)) {
        waitStrategy = (WaitStrategy)i$.next();
    }
    return waitTime;
}

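The combined strategy simply adds up the delays of its component strategies. exponentialWait gives 2, 4, 8, 16, 32, ... s (capped at 100 s here) and fixedWait gives 2, 2, 2, 2, 2, ... s, so the total delay is 4, 6, 10, 18, 34, ... s.

Test output: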

2019-03-09T19:46:45.854: do something... 1
2019-03-09T19:46:49.859: do something... 2
2019-03-09T19:46:55.859: do something... 3
2019-03-09T19:47:05.859: do something... 4
2019-03-09T19:47:23.859: do something... 5
2019-03-09T19:47:57.860: do something... 6
2019-03-09T19:49:03.861: do something... 7
2019-03-09T19:50:45.862: do something... 8
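
The gaps in this output (4, 6, 10, 18, 34, 66 and 102 s) can be reproduced by summing the two component delays by hand. The sketch below does that in plain Java, modelling each strategy as a function from attempt number to milliseconds. CompositeWaitSketch and its helper methods are hypothetical names that only imitate the additive behaviour, not the library's classes.

```java
import java.util.function.LongUnaryOperator;

// Illustrative only: models wait strategies as functions from attempt number to milliseconds.
class CompositeWaitSketch {

    // multiplier * 2^attempt, capped at maxMillis.
    static LongUnaryOperator exponential(long multiplier, long maxMillis) {
        return attempt -> Math.min(Math.round(multiplier * Math.pow(2.0, attempt)), maxMillis);
    }

    // The same wait after every attempt.
    static LongUnaryOperator fixed(long millis) {
        return attempt -> millis;
    }

    // The combined wait is the sum of the component waits.
    static LongUnaryOperator join(LongUnaryOperator... parts) {
        return attempt -> {
            long total = 0L;
            for (LongUnaryOperator part : parts) {
                total += part.applyAsLong(attempt);
            }
            return total;
        };
    }

    public static void main(String[] args) {
        LongUnaryOperator combined = join(exponential(1000L, 100_000L), fixed(2000L));
        for (long attempt = 1; attempt <= 7; attempt++) {
            // prints 4000, 6000, 10000, 18000, 34000, 66000 and 102000
            System.out.println("after attempt " + attempt + ": " + combined.applyAsLong(attempt) + " ms");
        }
    }
}
```

The seventh value is 102000 rather than 130000 because the exponential part (128000 ms) is capped at 100 s before the fixed 2 s is added.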


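11. Listener: handling retry details with RetryListener


Use case: a custom listener that prints the details of each attempt. Later it could be put to further use for asynchronous logging or other special handling.

import java.util.concurrent.ExecutionException;

import com.github.rholder.retry.Attempt;
import com.github.rholder.retry.RetryListener;

public class MyRetryListener implements RetryListener {
    @Override
    public <V> void onRetry(Attempt<V> attempt) {
        System.out.println("retry times=" + attempt.getAttemptNumber());
        // Delay since the first attempt
        System.out.println("delay=" + attempt.getDelaySinceFirstAttempt());
        // Outcome of the attempt: did it end with an exception or return normally?
        System.out.println("hasException=" + attempt.hasException());
        System.out.println("hasResult=" + attempt.hasResult());
        // What caused the exception
        if (attempt.hasException()) {
            System.out.println("causeBy=" + attempt.getExceptionCause());
        } else {
            // The result of a normal return
            System.out.println("result=" + attempt.getResult());
        }
        // Extra exception handling: get() throws ExecutionException if the attempt failed
        try {
            Object result = attempt.get();
            System.out.println("rude get=" + result);
        } catch (ExecutionException e) {
            System.out.println("this attempt produce exception." + e.getCause());
        }
    }
}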

Test:

public static void main(String[] args) throws ExecutionException, RetryException {
    Retryer<String> retry = RetryerBuilder.<String>newBuilder()
            .retryIfRuntimeException()
            .withStopStrategy(StopStrategies.stopAfterAttempt(2))
            .withRetryListener(new MyRetryListener())
            .build();
    retry.call(callableWithResult());
}

Output:

2019-03-09T16:32:35.097: do something... 1
retry times=1
delay=128
hasException=true
hasResult=false
causeBy=java.lang.RuntimeException: Run exception
this attempt produce exception.java.lang.RuntimeException: Run exception
2019-03-09T16:32:35.102: do something... 2
retry times=2
delay=129
hasException=true
hasResult=false
causeBy=java.lang.RuntimeException: Run exception
this attempt produce exception.java.lang.RuntimeException: Run exception
Exception in thread "main" com.github.rholder.retry.RetryException: Retrying failed to complete successfully after 2 attempts.


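Summary


Both approaches are fairly elegant ways to retry. Spring Retry is simpler to configure, and what it offers is correspondingly simpler. Guava itself is a high-quality Java library from Google, and guava-retrying is very capable as well: compared with Spring Retry it offers more options for deciding whether to retry, so it can serve as a complement to Spring Retry.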

