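Problem Description
A service that had been using Azure Redis normally suddenly showed Redis CPU at 100%, with severe performance degradation under otherwise normal use. The Redis charts in the portal showed CPU, Connections, Latency, and Server Load all running high.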
[Chart: CPU]
[Chart: Server Load]
[Chart: Latency (how long Redis takes to process requests)]
[Chart: Connections (number of client connections)]
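Problem Analysis
The charts show that the high connection count drove CPU up until it stayed at 100%, which pushed Server Load high (at times to 100%) and in turn raised latency across the whole Redis instance.
Creating a new connection is an expensive operation for Redis. In addition, every Azure Redis pricing tier has a limit on the number of client connections, and reaching that limit causes performance problems. Checked against the limits in the official Redis documentation, the connection count in this case had exceeded the maximum.
For example, the connection limits for the Basic tier are (https://www.azure.cn/pricing/details/cache/):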
Cache name | Cache size | Basic tier price | Network performance | Client connections |
C0 | 250 MB | ¥0.14/node/hour (approx. ¥104.16/month) | Low | 256 |
C1 | 1 GB | ¥0.35/node/hour (approx. ¥260.40/month) | Moderate | 1000 |
C2 | 2.5 GB | ¥0.57/node/hour (approx. ¥424.08/month) | Moderate | 2000 |
C3 | 6 GB | ¥1.14/node/hour (approx. ¥848.16/month) | High | 5000 |
C4 | 13 GB | ¥1.33/node/hour (approx. ¥989.52/month) | Moderate | 10000 |
C5 | 26 GB | ¥2.66/node/hour (approx. ¥1979.04/month) | High | 15000 |
C6 | 53 GB | ¥5.31/node/hour (approx. ¥3950.64/month) | Highest | 20000 |
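The limits in the table can be turned into a quick sanity check before sizing a connection pool. The following is a minimal sketch, not an Azure API: the class `ConnectionBudget` and its methods are hypothetical, the limits are copied from the table above, and it assumes the worst case in which every application instance fills its pool completely. The instance count and pool size in `main` are illustrative values only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: compares a planned pool size with the Basic-tier
// client connection limits quoted in the table above.
final class ConnectionBudget {
    // Client connection limits per cache name, copied from the table.
    static final Map<String, Integer> MAX_CLIENTS = new LinkedHashMap<>();
    static {
        MAX_CLIENTS.put("C0", 256);
        MAX_CLIENTS.put("C1", 1000);
        MAX_CLIENTS.put("C2", 2000);
        MAX_CLIENTS.put("C3", 5000);
        MAX_CLIENTS.put("C4", 10000);
        MAX_CLIENTS.put("C5", 15000);
        MAX_CLIENTS.put("C6", 20000);
    }

    // Assumption: worst case, every app instance opens poolMaxTotal connections.
    static int worstCaseConnections(int appInstances, int poolMaxTotal) {
        return Math.multiplyExact(appInstances, poolMaxTotal);
    }

    // True when the worst-case connection count stays within the tier limit.
    static boolean fits(String tier, int appInstances, int poolMaxTotal) {
        Integer limit = MAX_CLIENTS.get(tier);
        if (limit == null) {
            throw new IllegalArgumentException("Unknown tier: " + tier);
        }
        return worstCaseConnections(appInstances, poolMaxTotal) <= limit;
    }

    public static void main(String[] args) {
        // Illustrative numbers: 4 instances x 200 pooled connections = 800.
        System.out.println(fits("C0", 4, 200)); // false: 800 > 256
        System.out.println(fits("C1", 4, 200)); // true: 800 <= 1000
    }
}
```

A check like this only bounds pooled connections; clients that open connections outside a pool are not covered by it.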
For request concurrency and throughput, see (https://docs.azure.cn/zh-cn/azure-cache-for-redis/cache-planning-faq#azure-cache-for-redis-performance):
Pricing tier (Standard cache sizes) | Size | CPU cores | Available bandwidth, Mb/s / MB/s | Non-SSL requests per second (RPS), 1 KB values | SSL requests per second (RPS), 1 KB values |
C0 | 250 MB | Shared | 100/12.5 | 15,000 | 7,500 |
C1 | 1 GB | 1 | 500/62.5 | 38,000 | 20,720 |
C2 | 2.5 GB | 2 | 500/62.5 | 41,000 | 37,000 |
C3 | 6 GB | 4 | 1,000/125 | 100,000 | 90,000 |
C4 | 13 GB | 2 | 500/62.5 | 60,000 | 55,000 |
C5 | 26 GB | 4 | 1,000/125 | 102,000 | 93,000 |
C6 | 53 GB | 8 | 2,000/250 | 126,000 | 120,000 |
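Solution
Use a connection pool, such as JedisPool, to fix the connection problem. See the notes on GitHub: https://gist.github.com/JonCole/925630df72be1351b21440625ff2671f#use-jedispool (if the link is unreachable, see the repost "[转] Azure Redis Best Practices - Java"). Sample code for reference: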
    import javax.net.ssl.HostnameVerifier;
    import javax.net.ssl.SSLParameters;
    import javax.net.ssl.SSLSocketFactory;

    import redis.clients.jedis.JedisPool;
    import redis.clients.jedis.JedisPoolConfig;

    public class Redis {
        // Apart from pool and config, the field declarations are elided in the excerpt;
        // they are inferred here from how the methods below use them.
        private static final Object staticLock = new Object();
        // volatile so the double-checked locking in getPoolInstance() is safe
        private static volatile JedisPool pool;
        private static String host;
        private static int port;             // 6380 is treated as the SSL port below
        private static String password;
        private static int connectTimeout;   // milliseconds
        private static int operationTimeout; // milliseconds
        private static JedisPoolConfig config;

        // Should be called exactly once during App Startup logic.
        public static void initializeSettings(String host, int port, String password, int connectTimeout, int operationTimeout) {
            Redis.host = host;
            Redis.port = port;
            Redis.password = password;
            Redis.connectTimeout = connectTimeout;
            Redis.operationTimeout = operationTimeout;
        }

        // MAKE SURE to call the initializeSettings method first
        public static JedisPool getPoolInstance() {
            if (pool == null) { // avoid synchronization lock if initialization has already happened
                synchronized (staticLock) {
                    if (pool == null) { // don't re-initialize if another thread beat us to it.
                        JedisPoolConfig poolConfig = getPoolConfig();
                        boolean useSsl = port == 6380;
                        int db = 0;
                        String clientName = "MyClientName"; // null means use default
                        SSLSocketFactory sslSocketFactory = null; // null means use default
                        SSLParameters sslParameters = null; // null means use default
                        // SimpleHostNameVerifier is a helper class from the gist, not shown in this excerpt.
                        HostnameVerifier hostnameVerifier = new SimpleHostNameVerifier(host);
                        pool = new JedisPool(poolConfig, host, port, connectTimeout, operationTimeout, password, db,
                                clientName, useSsl, sslSocketFactory, sslParameters, hostnameVerifier);
                    }
                }
            }
            return pool;
        }

        public static JedisPoolConfig getPoolConfig() {
            if (config == null) {
                JedisPoolConfig poolConfig = new JedisPoolConfig();

                // Each thread trying to access Redis needs its own Jedis instance from the pool.
                // Using too small a value here can lead to performance problems, too big and you have wasted resources.
                int maxConnections = 200;
                poolConfig.setMaxTotal(maxConnections);
                poolConfig.setMaxIdle(maxConnections);

                // Using "false" here will make it easier to debug when your maxTotal/minIdle/etc settings need adjusting.
                // Setting it to "true" will result better behavior when unexpected load hits in production
                poolConfig.setBlockWhenExhausted(true);

                // How long to wait before throwing when pool is exhausted
                poolConfig.setMaxWaitMillis(operationTimeout);

                // This controls the number of connections that should be maintained for bursts of load.
                // Increase this value when you see pool.getResource() taking a long time to complete under burst scenarios
                poolConfig.setMinIdle(50);

                Redis.config = poolConfig;
            }

            return config;
        }

        // ... (remaining lines of the original class are omitted in this excerpt)
    }
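To make the effect of `setMaxTotal`, `setBlockWhenExhausted`, and `setMaxWaitMillis` more concrete, here is a toy bounded pool written only with the Java standard library. It is an illustration of the borrow/return idea, not how Jedis or Apache Commons Pool is implemented; `ToyPool` and its method names are hypothetical. The point it shows is that the number of objects ever created is capped, and that a borrower either reuses a returned object or waits a bounded time and fails.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical toy pool: caps the number of created objects at maxTotal.
final class ToyPool<T> {
    private final BlockingQueue<T> idle;      // objects returned and ready for reuse
    private final Supplier<T> factory;        // creates a new "connection"
    private final int maxTotal;               // upper bound on objects ever created
    private final AtomicInteger created = new AtomicInteger();

    ToyPool(int maxTotal, Supplier<T> factory) {
        this.maxTotal = maxTotal;
        this.factory = factory;
        this.idle = new ArrayBlockingQueue<>(maxTotal);
    }

    // Reuse an idle object if possible, create one while under maxTotal,
    // otherwise wait up to maxWaitMillis for another caller to return one.
    T borrow(long maxWaitMillis) {
        T item = idle.poll();
        if (item != null) {
            return item;
        }
        while (true) {
            int n = created.get();
            if (n >= maxTotal) {
                break; // cap reached, fall through to waiting
            }
            if (created.compareAndSet(n, n + 1)) {
                try {
                    return factory.get();
                } catch (RuntimeException e) {
                    created.decrementAndGet(); // release the reserved slot
                    throw e;
                }
            }
        }
        try {
            item = idle.poll(maxWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for the pool", e);
        }
        if (item == null) {
            throw new IllegalStateException("pool exhausted");
        }
        return item;
    }

    // Return a borrowed object so that the next borrower can reuse it.
    void giveBack(T item) {
        idle.offer(item);
    }

    int totalCreated() {
        return created.get();
    }
}
```

With Jedis itself, the corresponding habit is to borrow with `getResource()` inside a try-with-resources block, since closing a pooled `Jedis` instance returns it to the pool rather than leaving the connection checked out.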
Sample code for PHP, Node.js, and ASP.NET can also be found in the GitHub gist above.
Using the Lazy pattern in your code is also strongly recommended.
Reconnecting with Lazy<T> pattern
We have seen a few rare cases where StackExchange.Redis fails to reconnect after a connection blip (for example, due to patching). Restarting the client or creating a new ConnectionMultiplexer will fix the issue. Here is some sample code that still uses the recommended Lazy<ConnectionMultiplexer> pattern while allowing apps to force a reconnection periodically. Make sure to update code calling into the ConnectionMultiplexer so that they handle any ObjectDisposedException errors that occur as a result of disposing the old one.
    using System;
    using System.Threading;
    using StackExchange.Redis;

    static class Redis
    {
        // ... (lines omitted in this excerpt)

        static string connectionString = "TODO: CALL InitializeConnectionString() method with connection string";
        static Lazy<ConnectionMultiplexer> multiplexer = CreateMultiplexer();

        public static ConnectionMultiplexer Connection { get { return multiplexer.Value; } }

        // ... (lines omitted in this excerpt)

        private static Lazy<ConnectionMultiplexer> CreateMultiplexer()
        {
            return new Lazy<ConnectionMultiplexer>(() => ConnectionMultiplexer.Connect(connectionString));
        }
    }
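The C# excerpt relies on .NET's `Lazy<T>`, which has no direct counterpart in the Java standard library. For Java services, the same idea (create the shared connection object once on first use, and allow a forced re-creation later) could be sketched roughly as follows. `ReconnectableLazy` is a hypothetical class name, it is not part of StackExchange.Redis or Jedis, and the sketch assumes the factory never returns null.

```java
import java.util.function.Supplier;

// Hypothetical Java analogue of a resettable Lazy<T>.
final class ReconnectableLazy<T> {
    private final Supplier<T> factory; // assumed to never return null
    private volatile T value;          // volatile for safe double-checked locking

    ReconnectableLazy(Supplier<T> factory) {
        this.factory = factory;
    }

    // Creates the value on first access; later calls return the same instance.
    T get() {
        T current = value;
        if (current == null) {
            synchronized (this) {
                current = value;
                if (current == null) {
                    current = factory.get();
                    value = current;
                }
            }
        }
        return current;
    }

    // Forgets the current value so that the next get() creates a fresh one.
    // Returns the old value (possibly null) so the caller can close it.
    synchronized T reset() {
        T old = value;
        value = null;
        return old;
    }
}
```

As with the C# version, callers that still hold the old instance after a reset need to tolerate errors raised once that instance has been closed.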
References
Redis Best Practices: https://gist.github.com/JonCole/925630df72be1351b21440625ff2671f#reuse-connections (if the link is unreachable, see the repost "[转]Azure Redis Best Practices - PHP")
Azure Cache for Redis performance: https://docs.azure.cn/zh-cn/azure-cache-for-redis/cache-planning-faq#azure-cache-for-redis-performance
Reuse Connections: https://gist.github.com/JonCole/925630df72be1351b21440625ff2671f#php
Reuse Connections
The most common problem we have seen with PHP clients is that they either don't support persistent connections or the ability to reuse connections is disabled by default. When you don't reuse connections, it means that you have to pay the cost of establishing a new connection, including the SSL/TLS handshake, each time you want to send a request. This can add a lot of latency to your request time and will manifest itself as a performance problem in your application. Additionally, if you have a high request rate, this can cause significant CPU churn on both the Redis client-side and server-side, which can result in other issues.
As an example, the Predis Redis client has a "persistent" connection property that is false by default. Setting the "persistent" property to true should improve behavior drastically.