前言
在日常开发中,限流是高并发系统的三把守护利器之一,它的另外两个好兄弟缓存、降级下次再说。而限流在绝大多数场景中用来限制并发和请求量,像秒杀之类的高流量业务的场景,都能见到它的身影,所以它就是保护系统和下游的业务系统不被流量冲垮的利器。
令牌桶限流
常见的限流方式有漏桶限流跟令牌桶限流,我们先来回顾一下,什么是令牌桶限流?
令牌桶算法则是一个存放固定容量令牌的桶,按照固定速率往桶里添加令牌。桶中存放的令牌数有最大上限,超出之后就被丢弃或者拒绝。当流量或者网络请求到达时,每个请求都要获取一个令牌,如果能够获取到,则直接处理,并且令牌桶删除一个令牌。如果获取不同,该请求就要被限流,要么直接丢弃,要么在缓冲区等待。
RateLimiter
Google Guava提供了RateLimiter类,它实现了单机版也就是单点的令牌桶限流,你如果需要分布式集群的限流功能,恐怕RateLimiter无法满足你,而你则需要利用Reids或者Sentinel之类的中间件来实现如此。
Guava RateLimiter类:http://ifeve.com/guava-ratelimiter/
核心变量
/**
* 当前令牌数,数量不能超过最大许可数
* The currently stored permits.
*/
double storedPermits;
/**
* 最大许可数
* The maximum number of stored permits.
*/
double maxPermits;
/**
* 添加令牌时间间隔毫秒, stableIntervalMicros = 1s/permitsPerSecond
* The interval between two unit requests, at our stable rate. E.g., a stable rate of 5 permits
* per second has a stable interval of 200ms.
*/
double stableIntervalMicros;
/**
* 表示下一次允许补充许可的时间
* The time when the next request (no matter its size) will be granted. After granting a
* request, this is pushed further in the future. Large requests push this further than small
* requests.
*/
private long nextFreeTicketMicros = 0L; // could be either in the past or future
create
- Guava提供了两个工厂方法来创建一个稳定输出令牌的限流器,保证了平均每秒不超过permitsPerSecond个请求。
- 这两个方法对应了两种模式,稳定模式(SmoothBursty)、渐进模式(SmoothWarmingUp)。
// RateLimiter提供了两个工厂方法,最终会调用下面两个函数,生成RateLimiter的两个子类。
static RateLimiter create(SleepingStopwatch stopwatch, double permitsPerSecond) {
// 稳定模式:令牌生成速度恒定
RateLimiter rateLimiter = new SmoothBursty(stopwatch, 1.0 /* maxBurstSeconds */);
rateLimiter.setRate(permitsPerSecond);
return rateLimiter;
}
static RateLimiter create(SleepingStopwatch stopwatch,
double permitsPerSecond,
long warmupPeriod,
TimeUnit unit,
double coldFactor) {
// 渐进模式:令牌生成速度缓慢提升直到维持在一个稳定值
// 它启动后会有一段预热期,逐步将分发频率提升到配置的速率。
// 这种功能适合系统刚启动需要一点时间来“热身”的场景,且如果被长期闲置不用,它将回到冷却状态。
RateLimiter rateLimiter = new SmoothWarmingUp(stopwatch, warmupPeriod, unit, coldFactor);
rateLimiter.setRate(permitsPerSecond);
return rateLimiter;
}
比如下面代码中的例子,创建一个平均分发令牌速率为2(单个令牌的发放时间就是:1s/2个令牌=0.5s),预热期为4s。由于设置了预热时间是4秒,一开始并不会0.5s发一个令牌,而是形成一个平滑线性下降的坡度,频率越来越高,在4秒钟之内达到原本设置的频率,以后就以固定的频率输出。这种功能适合系统刚启动需要一点时间来预热的场景。
public static void main(String[] args) throws Exception {
RateLimiter rateLimiter = RateLimiter.create(2, 4, TimeUnit.SECONDS);
while (true) {
// get 1 tokens: 1.372347s
// get 1 tokens: 1.11549s
// get 1 tokens: 0.872058s
// get 1 tokens: 0.621273s
// get 1 tokens: 0.494724s
// get 1 tokens: 0.496234s
// get 1 tokens: 0.49788s
// get 1 tokens: 0.497621s
// get 1 tokens: 0.496468s
// get 1 tokens: 0.495476s
// get 1 tokens: 0.494625s
// get 1 tokens: 0.495438s
System.out.println("get 1 tokens: " + rateLimiter.acquire(1) + "s");
}
}
它底层是通过SmoothWarmingUp来实现带有预热期的平滑限流的,它启动后会有一段预热期,逐步将分发频率提升到配置的速率,你可以在源码中找到如下注释,@see com.google.common.util.concurrent.SmoothRateLimiter.SmoothWarmingUp
/**
* This implements the following function where coldInterval = coldFactor * stableInterval.
*
* ^ throttling
* |
* cold + /
* interval | /.
* | / .
* | / . <-- "warmup period" is the area of the trapezoid between
* | / . thresholdPermits and maxPermits
* | / .
* | / .
* | / .
* stable +----------/ WARM .
* interval | . UP .
* | . PERIOD.
* | . .
* 0 +----------+-------+--------------> storedPermits
* 0 thresholdPermits maxPermits
* Before going into the details of this particular function, let's keep in mind the basics:
* 1) The state of the RateLimiter (storedPermits) is a vertical line in this figure.
* 2) When the RateLimiter is not used, this goes right (up to maxPermits)
* 3) When the RateLimiter is used, this goes left (down to zero), since if we have storedPermits,
* we serve from those first
* 4) When _unused_, we go right at a constant rate! The rate at which we move to
* the right is chosen as maxPermits / warmupPeriod. This ensures that the time it takes to
* go from 0 to maxPermits is equal to warmupPeriod.
* 5) When _used_, the time it takes, as explained in the introductory class note, is
* equal to the integral of our function, between X permits and X-K permits, assuming
* we want to spend K saved permits.
*
* In summary, the time it takes to move to the left (spend K permits), is equal to the
* area of the function of width == K.
*
* Assuming we have saturated demand, the time to go from maxPermits to thresholdPermits is
* equal to warmupPeriod. And the time to go from thresholdPermits to 0 is
* warmupPeriod/2. (The reason that this is warmupPeriod/2 is to maintain the behavior of
* the original implementation where coldFactor was hard coded as 3.)
*
* It remains to calculate thresholdsPermits and maxPermits.
*
* - The time to go from thresholdPermits to 0 is equal to the integral of the function between
* 0 and thresholdPermits. This is thresholdPermits * stableIntervals. By (5) it is also
* equal to warmupPeriod/2. Therefore
*
* thresholdPermits = 0.5 * warmupPeriod / stableInterval.
*
* - The time to go from maxPermits to thresholdPermits is equal to the integral of the function
* between thresholdPermits and maxPermits. This is the area of the pictured trapezoid, and it
* is equal to 0.5 * (stableInterval + coldInterval) * (maxPermits - thresholdPermits). It is
* also equal to warmupPeriod, so
*
* maxPermits = thresholdPermits + 2 * warmupPeriod / (stableInterval + coldInterval).
*/
- 横坐标是当前令牌桶中的令牌storedPermits, 它将storedPermits分为两个区间,分别为:[0, thresholdPermits) 和 [thresholdPermits, maxPermits]。
- 纵坐标是请求的间隔时间,stableInterval就是1/QPS,例如设置的QPS为5,则stableInterval就是200ms,coldInterval = stableInterval * coldFactor,这里的coldFactor"hard-code"默认就是3。
- 它实现预热缓冲的关键在于下发令的牌的速率会随时间和令牌数而改变,速率会先慢后快。令牌刷新的时间间隔由长逐渐变短。等存储令牌数从maxPermits到达thresholdPermits时,发放令牌的时间间隔也由coldInterval降低到了正常的stableInterval。
acquire
- acquire()获取1个令牌的方法,该方法会被阻塞直到获取到。acquire(int permits)为获取指定令牌的方法,同样也是会被阻塞直到获取到。
public static void main(String[] args) throws Exception {
final RateLimiter rateLimiter = RateLimiter.create(2);
while (true) {
// get 1 tokens: 0.497635s
// get 1 tokens: 0.493058s
// get 1 tokens: 0.49495s
// get 1 tokens: 0.497312s
// get 1 tokens: 0.495415s
// get 1 tokens: 0.495031s
System.out.println("get 1 tokens: " + rateLimiter.acquire() + "s");
}
}
注意:因为这种方式,先来的流量一般都能执行,被拒绝的基本都是后来的流量了,所以每个请求被执行的概率其实不一样的,所以就没有公平性可言。
tryAcquire
**尝试获取令牌,分为待超时时间和不带超时时间两种:**
- tryAcquire():从RateLimiter 获取许可,如果该许可可以在无延迟下的情况下立即获取得到的话。
- tryAcquire(int permits):从RateLimiter 获取许可数,如果该许可数可以在无延迟下的情况下立即获取得到的话。
- tryAcquire(int permits, long timeout, TimeUnit unit):从RateLimiter 获取指定许可数如果该许可数可以在不超过timeout的时间内获取得到的话,或者如果无法在timeout过期之前获取得到许可数的话,那么立即返回false(无需等待)。
- tryAcquire(long timeout, TimeUnit unit):从RateLimiter 获取许可如果该许可可以在不超过timeout的时间内获取得到的话,或者如果无法在timeout过期之前获取得到许可的话,那么立即返回false(无需等待)。
public static void main(String[] args) throws Exception {
final RateLimiter rateLimiter = RateLimiter.create(2);
while (true) {
if (rateLimiter.tryAcquire()) {
doSomething();
} else {
doSomethingElse();
}
}
}
惰性计算
RateLimiter中采取的是惰性计算方式,在每次请求进来的时候先去计算上次请求和本次请求之间应该生成多少个令牌。它具体实现是通过其resync方法实现的,resync函数用于增加存储令牌,核心逻辑就是 (nowMicros-nextFreeTicketMicros)/stableIntervalMicros。当前时间大于nextFreeTicketMicros时进行刷新,否则直接返回,具体逻辑如下,请阅读相关源码,在与其参考。
/**
* Updates {@code storedPermits} and {@code nextFreeTicketMicros} based on the current time.
*/
void resync(long nowMicros) {
// 当前时间晚于nextFreeTicketMicros,所以刷新令牌和nextFreeTicketMicros
if (nowMicros > nextFreeTicketMicros) {
// coolDownIntervalMicros函数获取每秒生成一个令牌,SmoothWarmingUp和SmoothBuresty的实现不同
// SmoothBuresty的coolDownIntervalMicros直接返回stableIntervalMicros
// 当前时间减去要更新令牌的时间获取时间间隔,再除以添加令牌时间间隔获取这段时间内要添加的令牌数
storedPermits = min(maxPermits,
storedPermits
+ (nowMicros - nextFreeTicketMicros) / coolDownIntervalMicros());
nextFreeTicketMicros = nowMicros;
}
// 如果当前时间早于nextFreeTicketMicros,则获取令牌的线程要一直等待到nextFreeTicketMicros,该线程获取令牌所需
// 额外等待的时间由下一次获取的线程来代替等待。
}
double coolDownIntervalMicros() {
// 添加令牌时间间隔
return stableIntervalMicros;
}