在一个需要低延时响应的hbase集群中,使用hbase默认的客户端超时配置简直就是灾难。
但是我们可以考虑在客户端上加上如下几个参数,去改变这种状况:
1. hbase.rpc.timeout: RPC timeout, The default 60s, 可以修改为5000(5s)
2. ipc.socket.timeout: Socket link timeout, should be less than or equal to RPC timeout, the default is 20s
3. hbase.client.retries.number: The number of retries, default is 14, 可以配置为1
4. hbase.client.pause: Sleep time again, the default is 1s, can be reduced, such as 100ms(在1.1版本的hbase已经变为100ms,请对照你使用的hbase版本)
5. zookeeper.recovery.retry: The number of retries ZK, Can be adjusted to 3 times, ZK is not easy to hang, And if HBase cluster problem, Each retry retry the operation of ZK will be, The total number of retry ZK is: hbase.client.retries.number * zookeeper.recovery.retry, And sleep time each retry will have exponential growth of 2, Every time you access the HBase will try again, In a HBase operation if it involves multiple ZK access, If ZK is not available, There will be many times the ZK retry, Is a waste of time.
6. zookeeper.recovery.retry.intervalmill: Sleep time ZK retries, the default is 1s, can be reduced, for example: 200ms
7. hbase.regionserver.lease.period: A scan query when interacting with server timeout, the default is 60s.(在1.1版本,该参数名为hbase.client.scanner.timeout.period)
Retry interval strategy RPC:
public static long getPauseTime(final long pause, final int tries) {
int ntries = tries;
// RETRY_BACKOFF[] = { 1, 1, 1, 2, 2, 4, 4, 8, 16, 32, 64 }
if (ntries >= HConstants.RETRY_BACKOFF.length) {
ntries = HConstants.RETRY_BACKOFF.length - 1;
}
long normalPause = pause * HConstants.RETRY_BACKOFF[ntries];
long jitter = (long)(normalPause * RANDOM.nextFloat() * 0.01f); // 1% possible jitter
return normalPause + jitter;
}
Retry interval strategy ZK:
// RetryCounter类
//Sleep time 指数级增长
public void sleepUntilNextRetry() throws InterruptedException {
int attempts = getAttemptTimes();
long sleepTime = (long) (retryIntervalMillis * Math.pow(2, attempts));
timeUnit.sleep(sleepTime);
}
// retriesRemaining, The default value ismaxReties, Each retry after reduction1
public int getAttemptTimes() {
return maxRetries-retriesRemaining+1;
}