转载请注明出处哈:http://carlosfu.iteye.com/blog/2240426
一、现象:
我们的redis私有云,对外提供了redis-standalone, redis-sentinel, redis-cluster三种类型的redis服务。
其中redis-cluster, 使用的版本是 Redis Cluster 3.0.2, 客户端是jedis 2.7.2。
有人在使用时候,业务的日志中发现了一些异常(Too many Cluster redirections)。
二、jedis源码分析:
先从jedis源码中找到这个异常,这段异常是在JedisClusterCommand类中
if (redirections <= 0) { throw new JedisClusterMaxRedirectionsException("Too many Cluster redirections? key=" + key); }
在jedis中调用redis-cluster使用的JedisCluster类,所有api的调用方式类似如下:
public String set(final String key, final String value) { return new JedisClusterCommand(connectionHandler, maxRedirections) { @Override public String execute(Jedis connection) { return connection.set(key, value); } }.run(key); }
所以重点代码在JedisClusterCommand这个类里,重要代码如下:
public T run(int keyCount, String... keys) { if (keys == null || keys.length == 0) { throw new JedisClusterException("No way to dispatch this command to Redis Cluster."); } if (keys.length > 1) { int slot = JedisClusterCRC16.getSlot(keys[0]); for (int i = 1; i < keyCount; i++) { int nextSlot = JedisClusterCRC16.getSlot(keys[i]); if (slot != nextSlot) { throw new JedisClusterException("No way to dispatch this command to Redis Cluster " + "because keys have different slots."); } } } return runWithRetries(SafeEncoder.encode(keys[0]), this.redirections, false, false); } private T runWithRetries(byte[] key, int redirections, boolean tryRandomNode, boolean asking) { if (redirections <= 0) { JedisClusterMaxRedirectionsException exception = new JedisClusterMaxRedirectionsException( "Too many Cluster redirections? key=" + SafeEncoder.encode(key)); throw exception; } Jedis connection = null; try { if (asking) { // TODO: Pipeline asking with the original command to make it // faster.... connection = askConnection.get(); connection.asking(); // if asking success, reset asking flag asking = false; } else { if (tryRandomNode) { connection = connectionHandler.getConnection(); } else { connection = connectionHandler.getConnectionFromSlot(JedisClusterCRC16.getSlot(key)); } } return execute(connection); } catch (JedisConnectionException jce) { if (tryRandomNode) { // maybe all connection is down throw jce; } // release current connection before recursion releaseConnection(connection); connection = null; // retry with random connection return runWithRetries(key, redirections - 1, true, asking); } catch (JedisRedirectionException jre) { // if MOVED redirection occurred, if (jre instanceof JedisMovedDataException) { // it rebuilds cluster's slot cache // recommended by Redis cluster specification this.connectionHandler.renewSlotCache(connection); } // release current connection before recursion or renewing releaseConnection(connection); connection = null; if (jre instanceof JedisAskDataException) { asking = true; askConnection.set(this.connectionHandler.getConnectionFromNode(jre.getTargetNode())); } else if (jre instanceof JedisMovedDataException) { } else { throw new JedisClusterException(jre); } return runWithRetries(key, redirections - 1, false, asking); } finally { releaseConnection(connection); } }
代码解释:
1. 所有jedis.set这样的调用,都用JedisClusterCommand包装起来(模板方法)
2. 如果操作的是多个不同的key, 会抛出如下异常,因为redis-cluster不支持key的批量操作(可以通过其他方法解决,以后会介绍):
throw new JedisClusterException("No way to dispatch this command to Redis Cluster because keys have different slots.");
3. 参数解释
private T runWithRetries(byte[] key, int redirections, boolean tryRandomNode, boolean asking) {
(1) key: 不多说了
(2) redirections: 节点调转次数(实际可以看做是重试次数)
(3) tryRandomNode: 是否从redis cluster随机选一个节点进行操作
(4) asking: 是否发生了asking问题
4. 逻辑说明:
正常逻辑:
(1) asking = true: 获取asking对应的jedis, 然后用这个jedis操作。
(2) tryRandomNode= true: 从jedis连接池随机获取一个可用的jedis, 然后用这个jedis操作。
(3) 都不是:直接用key->slot->jedis,直接找到key对应的jedis, 然后用这个jedis操作。
异常逻辑:
(1) JedisConnectionException: 连接出了问题,连接断了、超时等等,tryRandomNode= true,递归调用本函数
(2) JedisRedirectionException分两种:
---JedisMovedDataException: 节点迁移了,重新renew本地slot对redis节点的对应Map
---JedisAskDataException: 数据迁移,发生asking问题, 获取asking的Jedis
此过程最多循环redirections次。
异常含义:试了redirections次(上述),仍然没有完成redis操作。
三、原因猜测:
1. 超时比较多,默认超时时间是2秒。
(1). 网络原因:比如是否存在跨机房、网络割接等等。
(2). 慢查询,因为redis是单线程,如果有慢查询的话,会阻塞住之后的操作。
(3). value值过大?
(4). aof 重写/rdb fork发生?
2. 节点之间关系发生变化,会发生JedisMovedDataException
3. 数据在节点之间迁移,会发生JedisAskDataException
四、定位方法:
看了一下redis的日志第三节中的2,3并未发生,应该是超时的情况。
异常发生在2015-10-19 04:34:30左右,给出如下异常key值
key=v11Pay|huid|wlunm99_561555097 key=play_anchorroom_info_529460 key=v11Pay|huid|qq-qhncnxujax key=play_anchor_info_qq-luzvfcftnf key=play_anchor_info_qq-luzvfcftnf key=play_anchorroom_info_550649 key=play_anchor_info_qq-cfrkukhdsd key=play_anchor_info_qq-rbufgcqbvk
经过查询,这些key都同时定位在一个redis实例上,于是看了一下这个redis实例的日志(与异常时间点对应),发现如下:AOF fsync发生了异常,以经验看是本地IO使用较大造成的。
17932:M 19 Oct 04:35:30.010 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17932:M 19 Oct 04:35:41.087 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17932:M 19 Oct 04:35:47.044 * Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis. 17932:M 19 Oct 10:15:51.463 * Starting automatic rewriting of AOF on 1795% growth
看了一下tsar的历史记录:tsar --io -n 2 | head -200
Time rrqms wrqms rs ws rsecs wsecs rqsize qusize await svctm util 19/10/15-04:00 0.00 164.08 0.01 34.52 0.04 745.15 21.58 0.00 25.30 4.13 14.26 19/10/15-04:05 40.38 1.1K 218.49 78.39 13.9K 4.9K 64.55 7.00 24.63 2.80 83.19 19/10/15-04:10 37.15 1.0K 360.58 71.91 13.4K 4.3K 42.04 6.00 14.67 1.70 73.34 19/10/15-04:15 1.99 1.5K 21.98 115.38 588.69 6.6K 53.12 5.00 39.86 1.98 27.14 19/10/15-04:20 40.17 1.0K 278.00 76.79 10.4K 4.2K 42.32 4.00 11.48 1.60 56.85 19/10/15-04:25 78.28 861.13 381.34 62.33 14.3K 3.6K 41.40 4.00 9.85 1.51 66.78 19/10/15-04:30 81.64 913.85 402.37 55.35 15.1K 3.8K 42.18 4.00 9.47 1.41 64.71 19/10/15-04:35 21.92 888.72 145.97 58.00 16.2K 3.7K 99.71 4.00 20.57 3.63 74.04 19/10/15-04:40 39.72 474.01 169.01 48.26 14.3K 2.0K 77.09 3.00 17.83 4.14 89.89 19/10/15-04:45 47.02 537.60 149.41 41.50 16.7K 2.3K 101.55 3.00 18.27 4.21 80.35
于是发现从4点开始IO开销一直增大,以经验看应该是有定时任务(都是托管的机器,上面还有别人的应用),于是发现了如下,是一个nginx合并的脚本,本地IO开销较大。
00 04 * * * sh /opt/script/logcron.sh 00 04 * * * sh /opt/script/logremove.sh
五、解决方法:
(1) 和使用方沟通一下,他们完全把redis当做memcache用,也就是允许断电后数据丢失,重新从数据源获取数据写到缓存,因此关闭了aof配置(此方法不治本)
(2) 定时脚本下线或者优化。(最终采用此方法)