[Redis Core Knowledge, Part 8] Redis Clustering: Cluster Mode and Cluster Setup (Part 2)


Start the cluster with the redis-trib.rb command:

./redis-trib.rb create --replicas 1  192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380

Instead of starting, it prints the following message:

[root@192 src]# ./redis-trib.rb create --replicas 1  192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380
WARNING: redis-trib.rb is not longer available!
You should use redis-cli instead.
All commands and features belonging to redis-trib.rb have been moved
to redis-cli.
In order to use them you should call redis-cli with the --cluster
option followed by the subcommand name, arguments and options.
Use the following syntax:
redis-cli --cluster SUBCOMMAND [ARGUMENTS] [OPTIONS]
Example:
redis-cli --cluster create 192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380 --cluster-replicas 1
To get help about all subcommands, type:
redis-cli --cluster help

Starting the cluster and checking the startup log

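Nothing beats trying it yourself: newer Redis releases (this walkthrough uses 6.0.8) no longer rely on Ruby to create a cluster, and redis-cli does the job directly: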

[root@192 src]# redis-cli --cluster create 192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380 --cluster-replicas 1
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 192.168.5.102:6380 to 192.168.5.101:6379
Adding replica 192.168.5.103:6380 to 192.168.5.102:6379
Adding replica 192.168.5.101:6380 to 192.168.5.103:6379
M: 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379
   slots:[0-5460] (5461 slots) master
M: 94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379
   slots:[5461-10922] (5462 slots) master
M: 6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379
   slots:[10923-16383] (5461 slots) master
S: 13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380
   replicates 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b
S: 46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380
   replicates 94d9e1637bd1701b146e367ffa7a69e8c24566e8
S: bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380
   replicates 6c1005a89742e50db240775204c03ab3d7558e59
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.
>>> Performing Cluster Check (using node 192.168.5.101:6379)
M: 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380
   slots: (0 slots) slave
   replicates 6c1005a89742e50db240775204c03ab3d7558e59
S: 46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380
   slots: (0 slots) slave
   replicates 94d9e1637bd1701b146e367ffa7a69e8c24566e8
M: 94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380
   slots: (0 slots) slave
   replicates 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b
M: 6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[root@192 src]#
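
The three ranges printed above (0-5460, 5461-10922, 10923-16383) come from dividing the 16384 hash slots as evenly as possible among the masters. The sketch below reproduces that arithmetic in Python. The function name split_slots and the exact rounding rule are assumptions chosen to match the output shown here, not a copy of the redis-cli source.

```python
import math
from typing import List, Tuple

TOTAL_SLOTS = 16384  # fixed number of hash slots in a Redis Cluster


def split_slots(master_count: int) -> List[Tuple[int, int]]:
    """Divide the slot space into contiguous, near-equal ranges (hypothetical helper)."""
    if master_count < 1:
        raise ValueError("at least one master is required")
    per_master = TOTAL_SLOTS / master_count  # fractional share per master
    ranges = []
    first = 0
    cursor = 0.0
    for i in range(master_count):
        # Round half up; the last master always takes whatever remains.
        last = math.floor(cursor + per_master - 1 + 0.5)
        if i == master_count - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


if __name__ == "__main__":
    for index, (first, last) in enumerate(split_slots(3)):
        print(f"Master[{index}] -> Slots {first} - {last}")
```

With three masters this prints the same three "Master[n] -> Slots" lines as the redis-cli output, including the middle master receiving 5462 slots instead of 5461.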

The node configuration file of each instance now looks like this:

Node configuration file and server log for port 6379 on machine 101

bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605332446460 3 connected
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605332448477 2 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 master - 0 1605332448000 2 connected 5461-10922
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 slave 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 0 1605332446000 1 connected
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605332447469 3 connected 10923-16383
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 myself,master - 0 1605332447000 1 connected 0-5460
vars currentEpoch 6 lastVoteEpoch 0

We can also look at this master's log from the cluster startup:

70362:M 14 Nov 2020 13:40:43.171 # configEpoch set to 1 via CLUSTER SET-CONFIG-EPOCH
70362:M 14 Nov 2020 13:40:43.211 # IP address for this node updated to 192.168.5.101
70362:M 14 Nov 2020 13:40:46.150 * Replica 192.168.5.102:6380 asks for synchronization
70362:M 14 Nov 2020 13:40:46.150 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '5c437fa82b0ca0caabfa2f0a17d15d9fc8f2548f', my replication IDs are 'd162ace4d79189e94f2a1f202c6fec4230a4189d' and '0000000000000000000000000000000000000000')
70362:M 14 Nov 2020 13:40:46.150 * Replication backlog created, my new replication IDs are '64bd8964903df1d43df728eed12e7c00956674f3' and '0000000000000000000000000000000000000000'
70362:M 14 Nov 2020 13:40:46.150 * Starting BGSAVE for SYNC with target: disk
70362:M 14 Nov 2020 13:40:46.151 * Background saving started by pid 85385
85385:C 14 Nov 2020 13:40:46.165 * DB saved on disk
85385:C 14 Nov 2020 13:40:46.167 * RDB: 0 MB of memory used by copy-on-write
70362:M 14 Nov 2020 13:40:46.258 * Background saving terminated with success
70362:M 14 Nov 2020 13:40:46.258 * Synchronization with replica 192.168.5.102:6380 succeeded
70362:M 14 Nov 2020 13:40:48.174 # Cluster state changed: ok
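
The "Partial resynchronization not accepted: Replication ID mismatch" line above is the master choosing between a partial and a full resync. A simplified Python model of that decision follows. The class and field names are invented for illustration, the backlog values and the requested offset are hypothetical, and the rule is a sketch of the behaviour described in the Redis replication documentation rather than the server's actual code.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    """Replication state kept by a master (field names are illustrative)."""
    replid: str                # current replication ID
    replid2: str               # previous ID, kept after a failover
    second_replid_offset: int  # replid2 is only valid up to this offset
    backlog_start: int         # first offset still held in the backlog
    backlog_len: int           # number of bytes held in the backlog


def can_partial_resync(master: MasterState, asked_id: str, asked_offset: int) -> bool:
    """Return True if the replica may continue from asked_offset."""
    id_matches = asked_id == master.replid or (
        asked_id == master.replid2 and asked_offset <= master.second_replid_offset
    )
    if not id_matches:
        return False  # "Replication ID mismatch": full resync needed
    if master.backlog_len == 0:
        return False  # no backlog yet, nothing to replay
    backlog_end = master.backlog_start + master.backlog_len
    return master.backlog_start <= asked_offset <= backlog_end


if __name__ == "__main__":
    # IDs copied from the log above; backlog values and offset are hypothetical.
    master = MasterState(
        replid="d162ace4d79189e94f2a1f202c6fec4230a4189d",
        replid2="0" * 40,
        second_replid_offset=-1,
        backlog_start=0,
        backlog_len=0,
    )
    asked = "5c437fa82b0ca0caabfa2f0a17d15d9fc8f2548f"
    print(can_partial_resync(master, asked, 1))  # False, so a full resync follows
```

Because a freshly joined replica carries a replication ID the master has never used, the first synchronization in a new cluster is always a full one, which is why the log continues with BGSAVE.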

Node configuration file and server log for port 6380 on machine 101

6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605332445000 3 connected 10923-16383
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 myself,slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605332446000 3 connected
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605332445452 2 connected
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 slave 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 0 1605332446000 1 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 master - 0 1605332447480 2 connected 5461-10922
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 master - 0 1605332446468 1 connected 0-5460
vars currentEpoch 6 lastVoteEpoch 0

And here is this slave's log from the cluster startup:

70795:M 14 Nov 2020 13:40:43.173 # configEpoch set to 6 via CLUSTER SET-CONFIG-EPOCH
70795:M 14 Nov 2020 13:40:43.334 # IP address for this node updated to 192.168.5.101
70795:S 14 Nov 2020 13:40:45.182 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
70795:S 14 Nov 2020 13:40:45.182 # Cluster state changed: ok
70795:S 14 Nov 2020 13:40:45.551 * Connecting to MASTER 192.168.5.103:6379
70795:S 14 Nov 2020 13:40:45.551 * MASTER <-> REPLICA sync started
70795:S 14 Nov 2020 13:40:45.551 * Non blocking connect for SYNC fired the event.
70795:S 14 Nov 2020 13:40:45.552 * Master replied to PING, replication can continue...
70795:S 14 Nov 2020 13:40:45.552 * Trying a partial resynchronization (request de83e39b5538aec381a9b7cb415f0afd9e9fb69e:1).
70795:S 14 Nov 2020 13:40:45.553 * Full resync from master: 36c2011a9e000813ddbe35ea240fd347ad351a04:0
70795:S 14 Nov 2020 13:40:45.553 * Discarding previously cached master state.
70795:S 14 Nov 2020 13:40:45.634 * MASTER <-> REPLICA sync: receiving 175 bytes from master to disk
70795:S 14 Nov 2020 13:40:45.634 * MASTER <-> REPLICA sync: Flushing old data
70795:S 14 Nov 2020 13:40:45.659 * MASTER <-> REPLICA sync: Loading DB in memory
70795:S 14 Nov 2020 13:40:45.660 * Loading RDB produced by version 6.0.8
70795:S 14 Nov 2020 13:40:45.660 * RDB age 0 seconds
70795:S 14 Nov 2020 13:40:45.660 * RDB memory usage when created 2.46 Mb
70795:S 14 Nov 2020 13:40:45.660 * MASTER <-> REPLICA sync: Finished with success
70795:S 14 Nov 2020 13:40:45.661 * Background append only file rewriting started by pid 85384
70795:S 14 Nov 2020 13:40:45.700 * AOF rewrite child asks to stop sending diffs.
85384:C 14 Nov 2020 13:40:45.700 * Parent agreed to stop sending diffs. Finalizing AOF...
85384:C 14 Nov 2020 13:40:45.700 * Concatenating 0.00 MB of AOF diff received from parent.
85384:C 14 Nov 2020 13:40:45.700 * SYNC append only file rewrite performed
85384:C 14 Nov 2020 13:40:45.701 * AOF rewrite: 0 MB of memory used by copy-on-write
70795:S 14 Nov 2020 13:40:45.762 * Background AOF rewrite terminated with success
70795:S 14 Nov 2020 13:40:45.762 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
70795:S 14 Nov 2020 13:40:45.762 * Background AOF rewrite finished successfully

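Reading and writing data through the cluster

The output above shows the hash slots being divided among the masters. Next, let's see how data is actually stored and retrieved: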

[root@192 redis-6.0.8]# redis-cli -h 192.168.5.101
192.168.5.101:6379> set love guochengyu
(error) MOVED 16198 192.168.5.103:6379
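
The MOVED reply has a fixed shape: the keyword, the slot number, and the host:port of the node that owns the slot. A cluster-aware client is expected to parse it and retry the command against that node. A minimal Python sketch of the parsing step, using a hypothetical helper named parse_moved:

```python
from typing import NamedTuple


class Redirect(NamedTuple):
    slot: int
    host: str
    port: int


def parse_moved(error_message: str) -> Redirect:
    """Parse 'MOVED <slot> <host>:<port>' into its parts (hypothetical helper)."""
    parts = error_message.split()
    if len(parts) != 3 or parts[0] != "MOVED":
        raise ValueError(f"not a MOVED redirection: {error_message!r}")
    # rpartition splits on the last colon, so hosts containing colons still work.
    host, _, port = parts[2].rpartition(":")
    return Redirect(slot=int(parts[1]), host=host, port=int(port))


if __name__ == "__main__":
    print(parse_moved("MOVED 16198 192.168.5.103:6379"))
```

Note that the "(error)" prefix in the session above is added by redis-cli for display; the error text itself starts at MOVED.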

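The key hashes to slot 16198, which belongs to master 3 (192.168.5.103:6379), so the node we connected to refuses the write and points us there. To have the client follow such redirections automatically for both writes and reads, just add -c to the redis-cli command: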

[root@192 redis-6.0.8]#  redis-cli -h 192.168.5.101 -c
192.168.5.101:6379> set love guochenyubaobei
-> Redirected to slot [16198] located at 192.168.5.103:6379
OK
192.168.5.103:6379> get love
"guochenyubaobei"
192.168.5.103:6379>
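
Slot 16198 is not arbitrary. Per the Redis Cluster specification, the slot is CRC16(key) mod 16384, using the XMODEM variant of CRC16, and when the key contains a non-empty {...} hash tag only the text inside the first such tag is hashed. The Python sketch below implements that rule. The function names are my own, and the slot-to-master table is copied from the allocation printed earlier in this article.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots, honouring {hash tags}."""
    raw = key.encode("utf-8")
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:  # an empty "{}" is ignored
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


# Slot ranges taken from the allocation output earlier in this article.
SLOT_OWNERS = [
    (0, 5460, "192.168.5.101:6379"),
    (5461, 10922, "192.168.5.102:6379"),
    (10923, 16383, "192.168.5.103:6379"),
]


def owner_of(slot: int) -> str:
    """Look up which master serves a slot in this particular cluster."""
    for first, last, node in SLOT_OWNERS:
        if first <= slot <= last:
            return node
    raise ValueError(f"slot out of range: {slot}")


if __name__ == "__main__":
    slot = key_slot("love")
    print(slot, owner_of(slot))  # 16198 192.168.5.103:6379
```

A key such as {love}.detail would land in the same slot as love, which is how related keys can be kept on one node.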

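Master/slave failover in the cluster

There are two cases to look at: a master going offline, and a slave going offline.

Losing a lowly slave is no big deal

When a slave goes offline, its master's log shows the connection being lost, and once the slave has been unreachable for about 10 seconds it is marked as failed (in the logs below, roughly 12 seconds pass between the lost connection and the FAIL message). When the slave comes back online, the master replicates data to it again. All other masters and slaves keep track of the overall cluster state.

Shutting down the 102 slave (192.168.5.102:6380) of the 101 master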

20189:signal-handler (1605334497) Received SIGINT scheduling shutdown...
20189:S 14 Nov 2020 14:14:57.915 # User requested shutdown...
20189:S 14 Nov 2020 14:14:57.915 * Calling fsync() on the AOF file.
20189:S 14 Nov 2020 14:14:57.915 * Saving the final RDB snapshot before exiting.
20189:S 14 Nov 2020 14:14:57.916 * DB saved on disk
20189:S 14 Nov 2020 14:14:57.916 # Redis is now ready to exit, bye bye...
[root@192 config]#

Its master, the 101 master, has lost its slave:

70362:M 14 Nov 2020 14:14:57.919 # Connection with replica 192.168.5.102:6380 lost.
70362:M 14 Nov 2020 14:15:10.105 * FAIL message received from 6c1005a89742e50db240775204c03ab3d7558e59 about 13ab2f0291cea595f51d6efac60c3e62278e64cb

The 101 master heard this news from the 103 master, which was the first node to mark the slave as failing:

9496:M 14 Nov 2020 14:15:10.099 * Marking node 13ab2f0291cea595f51d6efac60c3e62278e64cb as failing (quorum reached).

The other nodes received the same notification.
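
"Quorum reached" in the 103 master's log means that a majority of the masters had reported the node as unreachable. On its own, a node can only flag a peer as possibly failed (PFAIL); the flag is promoted to FAIL once enough masters agree. A small Python sketch of that counting rule, with hypothetical names and ignoring the time window within which the reports must arrive:

```python
from typing import Set


def needed_quorum(master_count: int) -> int:
    """Majority of masters required to promote PFAIL to FAIL."""
    return master_count // 2 + 1


def should_mark_fail(master_count: int, reporting_masters: Set[str]) -> bool:
    """reporting_masters holds the IDs of masters that currently consider the
    node unreachable (the local node counts itself if it is a master)."""
    return len(reporting_masters) >= needed_quorum(master_count)


if __name__ == "__main__":
    # This cluster has three masters; node IDs are shortened for readability.
    print(needed_quorum(3))                                # 2
    print(should_mark_fail(3, {"6c1005a8"}))               # False
    print(should_mark_fail(3, {"6c1005a8", "94d9e163"}))   # True
```

With three masters, two of them must agree, so a single master with a flaky network link cannot declare a node dead by itself.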

Restarting the 102 slave of the 101 master

When the 102 slave comes back online, every node is notified and clears the FAIL flag:

9496:M 14 Nov 2020 14:23:29.425 * Clear FAIL state for node 13ab2f0291cea595f51d6efac60c3e62278e64cb: replica is reachable again.

At the same time, the master of the 102 slave handles the master-to-slave synchronization:

70362:M 14 Nov 2020 14:23:29.463 * Clear FAIL state for node 13ab2f0291cea595f51d6efac60c3e62278e64cb: replica is reachable again.
70362:M 14 Nov 2020 14:23:30.381 * Replica 192.168.5.102:6380 asks for synchronization
70362:M 14 Nov 2020 14:23:30.381 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'eeea1b2c358998aacc3dcf525484245183fcb9ab', my replication IDs are '64bd8964903df1d43df728eed12e7c00956674f3' and '0000000000000000000000000000000000000000')
70362:M 14 Nov 2020 14:23:30.381 * Starting BGSAVE for SYNC with target: disk
70362:M 14 Nov 2020 14:23:30.382 * Background saving started by pid 86035
86035:C 14 Nov 2020 14:23:30.386 * DB saved on disk
86035:C 14 Nov 2020 14:23:30.388 * RDB: 0 MB of memory used by copy-on-write
70362:M 14 Nov 2020 14:23:30.470 * Background saving terminated with success
70362:M 14 Nov 2020 14:23:30.470 * Synchronization with replica 192.168.5.102:6380 succeeded

And the 102 slave, naturally, has to sync its data from the master:

21392:M 14 Nov 2020 14:23:29.366 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
21392:M 14 Nov 2020 14:23:29.366 # Server initialized
21392:M 14 Nov 2020 14:23:29.366 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
21392:M 14 Nov 2020 14:23:29.366 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
21392:M 14 Nov 2020 14:23:29.367 * Reading RDB preamble from AOF file...
21392:M 14 Nov 2020 14:23:29.367 * Loading RDB produced by version 6.0.8
21392:M 14 Nov 2020 14:23:29.367 * RDB age 2563 seconds
21392:M 14 Nov 2020 14:23:29.367 * RDB memory usage when created 2.43 Mb
21392:M 14 Nov 2020 14:23:29.367 * RDB has an AOF tail
21392:M 14 Nov 2020 14:23:29.367 * Reading the remaining AOF tail...
21392:M 14 Nov 2020 14:23:29.367 * DB loaded from append only file: 0.000 seconds
21392:M 14 Nov 2020 14:23:29.367 * Ready to accept connections
21392:S 14 Nov 2020 14:23:29.367 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
21392:S 14 Nov 2020 14:23:29.367 # Cluster state changed: ok
21392:S 14 Nov 2020 14:23:30.374 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:23:30.374 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:23:30.375 * Non blocking connect for SYNC fired the event.
21392:S 14 Nov 2020 14:23:30.377 * Master replied to PING, replication can continue...
21392:S 14 Nov 2020 14:23:30.379 * Trying a partial resynchronization (request eeea1b2c358998aacc3dcf525484245183fcb9ab:1).
21392:S 14 Nov 2020 14:23:30.381 * Full resync from master: 64bd8964903df1d43df728eed12e7c00956674f3:2856
21392:S 14 Nov 2020 14:23:30.381 * Discarding previously cached master state.
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: receiving 176 bytes from master to disk
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: Flushing old data
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: Loading DB in memory
21392:S 14 Nov 2020 14:23:30.469 * Loading RDB produced by version 6.0.8
21392:S 14 Nov 2020 14:23:30.469 * RDB age 0 seconds
21392:S 14 Nov 2020 14:23:30.469 * RDB memory usage when created 2.45 Mb
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: Finished with success
21392:S 14 Nov 2020 14:23:30.469 * Background append only file rewriting started by pid 21396
21392:S 14 Nov 2020 14:23:30.519 * AOF rewrite child asks to stop sending diffs.
21396:C 14 Nov 2020 14:23:30.520 * Parent agreed to stop sending diffs. Finalizing AOF...
21396:C 14 Nov 2020 14:23:30.520 * Concatenating 0.00 MB of AOF diff received from parent.
21396:C 14 Nov 2020 14:23:30.520 * SYNC append only file rewrite performed
21396:C 14 Nov 2020 14:23:30.520 * AOF rewrite: 0 MB of memory used by copy-on-write
21392:S 14 Nov 2020 14:23:30.576 * Background AOF rewrite terminated with success
21392:S 14 Nov 2020 14:23:30.576 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
21392:S 14 Nov 2020 14:23:30.576 * Background AOF rewrite finished successfully

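A change of dynasty: the master becomes a slave

Now for the main event, a change of dynasty: the old master drops out of contact for a while, its slave usurps the throne, and when the old master returns it has to settle for being a lowly slave. In detail: when a master goes offline, its slave keeps trying to reconnect about once per second (the log below shows 13 attempts before the FAIL message arrives). Once the master has been unreachable for roughly 10 seconds and is marked as failed, the slave stands for election as the new master. When the old master reconnects it becomes a slave, because the dynasty has already changed. All other masters and slaves keep track of the overall cluster state.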
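
Before asking for votes, a slave waits for a short, partly random delay. The Redis Cluster specification gives it as a fixed 500 ms, plus a random 0 to 500 ms, plus 1000 ms times the slave's rank, where rank 0 is the slave with the most up-to-date replication offset. The log further down reports "Start of election delayed for 792 milliseconds (rank #0, offset 3430)", which fits that formula. A Python sketch with an invented function name:

```python
import random
from typing import Optional


def election_delay_ms(replica_rank: int, rng: Optional[random.Random] = None) -> int:
    """Delay before a slave starts a failover election, following the spec formula."""
    rng = rng or random.Random()
    fixed = 500                         # gives the FAIL state time to propagate
    jitter = rng.randint(0, 500)        # keeps competing slaves from colliding
    rank_penalty = replica_rank * 1000  # better-synced slaves go first
    return fixed + jitter + rank_penalty


if __name__ == "__main__":
    print(election_delay_ms(0))  # somewhere between 500 and 1000
    print(election_delay_ms(1))  # somewhere between 1500 and 2000
```

In this cluster each master has a single slave, so the rank is always 0 and only the fixed part and the jitter matter.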

Shutting down the 101 master

After the shutdown, its slave, the 102 slave, keeps trying to connect to the master as configured, but cannot:

21392:S 14 Nov 2020 14:30:21.427 # Connection with master lost.
21392:S 14 Nov 2020 14:30:21.427 * Caching the disconnected master state.
21392:S 14 Nov 2020 14:30:21.500 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:21.500 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:21.500 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:22.508 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:22.509 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:22.509 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:23.517 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:23.517 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:23.517 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:24.525 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:24.525 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:24.526 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:25.530 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:25.530 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:25.530 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:26.538 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:26.538 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:26.538 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:27.546 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:27.546 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:27.546 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:28.556 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:28.556 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:28.556 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:29.566 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:29.566 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:29.567 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:30.574 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:30.574 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:30.575 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:31.580 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:31.580 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:31.580 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:32.588 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:32.588 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:32.588 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:33.596 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:33.596 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:33.596 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:34.246 * FAIL message received from 6c1005a89742e50db240775204c03ab3d7558e59 about 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b
21392:S 14 Nov 2020 14:30:34.246 # Cluster state changed: fail
21392:S 14 Nov 2020 14:30:34.304 # Start of election delayed for 792 milliseconds (rank #0, offset 3430).
21392:S 14 Nov 2020 14:30:34.605 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:34.605 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:34.606 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:35.108 # Starting a failover election for epoch 7.
21392:S 14 Nov 2020 14:30:35.110 # Failover election won: I'm the new master.
21392:S 14 Nov 2020 14:30:35.110 # configEpoch set to 7 after successful failover
21392:M 14 Nov 2020 14:30:35.110 * Discarding previously cached master state.
21392:M 14 Nov 2020 14:30:35.110 # Setting secondary replication ID to 64bd8964903df1d43df728eed12e7c00956674f3, valid up to offset: 3431. New replication ID is 3088efe0ca743a931b3863cbb0fd673811c31b7a
21392:M 14 Nov 2020 14:30:35.110 # Cluster state changed: ok

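So the slave usurps the throne and promotes itself to master, and the other nodes are notified as well. The cluster information now looks like this: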

[root@192 redis-6.0.8]#  redis-cli -h 192.168.5.102 -c                    
192.168.5.102:6379> cluster nodes
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 master - 0 1605335617797 7 connected 0-5460
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605335615777 2 connected
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 master,fail - 1605335422306 1605335419000 1 disconnected
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605335614000 3 connected 10923-16383
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605335616788 3 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 myself,master - 0 1605335615000 2 connected 5461-10922
192.168.5.102:6379>

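We can see one failed master (192.168.5.101:6379) and one master that successfully seized the throne (192.168.5.102:6380, now serving slots 0-5460).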
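
Each line of CLUSTER NODES output is space-separated: node ID, address (ip:port@cluster-bus-port), flags, master ID (or - for a master), ping-sent, pong-received, config epoch, link state, and then any slot ranges. The Python sketch below parses such lines so that failed nodes and slot owners can be picked out programmatically. The names are invented for illustration, and only plain slot ranges are handled (migrating or importing markers are skipped).

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass
class ClusterNode:
    node_id: str
    address: str              # ip:port with the cluster bus port stripped
    flags: Set[str]
    master_id: Optional[str]  # None when the node is itself a master
    config_epoch: int
    link_state: str
    slots: List[Tuple[int, int]]


def parse_cluster_nodes(text: str) -> List[ClusterNode]:
    """Parse CLUSTER NODES output (hypothetical helper, simplified)."""
    nodes = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip prompts, "vars ..." lines and other noise
        slots = []
        for token in fields[8:]:
            if token.startswith("["):
                continue  # migrating/importing markers are ignored here
            first, _, last = token.partition("-")
            slots.append((int(first), int(last or first)))
        nodes.append(ClusterNode(
            node_id=fields[0],
            address=fields[1].split("@")[0],
            flags=set(fields[2].split(",")),
            master_id=None if fields[3] == "-" else fields[3],
            config_epoch=int(fields[6]),
            link_state=fields[7],
            slots=slots,
        ))
    return nodes


if __name__ == "__main__":
    # Two lines copied from the output above.
    sample = """
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 master - 0 1605335617797 7 connected 0-5460
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 master,fail - 1605335422306 1605335419000 1 disconnected
"""
    for node in parse_cluster_nodes(sample):
        state = "FAILED" if "fail" in node.flags else "ok"
        print(node.address, state, node.slots)
```

Run against the two sample lines, this reports 192.168.5.102:6380 as healthy with slots 0-5460 and 192.168.5.101:6379 as failed with no slots.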

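Restarting the 101 master, or rather the 101 slave, now a slave of the 102 master

It is back up, but it can only exist as a slave now, so after starting it has to sync its data from the new master: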

86216:M 14 Nov 2020 14:35:39.278 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
86216:M 14 Nov 2020 14:35:39.278 # Server initialized
86216:M 14 Nov 2020 14:35:39.278 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
86216:M 14 Nov 2020 14:35:39.278 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
86216:M 14 Nov 2020 14:35:39.278 * Ready to accept connections
86216:M 14 Nov 2020 14:35:39.279 # Configuration change detected. Reconfiguring myself as a replica of 13ab2f0291cea595f51d6efac60c3e62278e64cb
86216:S 14 Nov 2020 14:35:39.279 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
86216:S 14 Nov 2020 14:35:39.279 # Cluster state changed: ok
86216:S 14 Nov 2020 14:35:40.285 * Connecting to MASTER 192.168.5.102:6380
86216:S 14 Nov 2020 14:35:40.285 * MASTER <-> REPLICA sync started
86216:S 14 Nov 2020 14:35:40.285 * Non blocking connect for SYNC fired the event.
86216:S 14 Nov 2020 14:35:40.286 * Master replied to PING, replication can continue...
86216:S 14 Nov 2020 14:35:40.287 * Trying a partial resynchronization (request 647cca874c4c83a50d4fe5f82690eb51df36aa6d:1).
86216:S 14 Nov 2020 14:35:40.287 * Full resync from master: 3088efe0ca743a931b3863cbb0fd673811c31b7a:3430
86216:S 14 Nov 2020 14:35:40.287 * Discarding previously cached master state.
86216:S 14 Nov 2020 14:35:40.307 * MASTER <-> REPLICA sync: receiving 176 bytes from master to disk
86216:S 14 Nov 2020 14:35:40.307 * MASTER <-> REPLICA sync: Flushing old data
86216:S 14 Nov 2020 14:35:40.309 * MASTER <-> REPLICA sync: Loading DB in memory
86216:S 14 Nov 2020 14:35:40.309 * Loading RDB produced by version 6.0.8
86216:S 14 Nov 2020 14:35:40.309 * RDB age 0 seconds
86216:S 14 Nov 2020 14:35:40.309 * RDB memory usage when created 2.45 Mb
86216:S 14 Nov 2020 14:35:40.309 * MASTER <-> REPLICA sync: Finished with success
86216:S 14 Nov 2020 14:35:40.309 * Background append only file rewriting started by pid 86220
86216:S 14 Nov 2020 14:35:40.349 * AOF rewrite child asks to stop sending diffs.
86220:C 14 Nov 2020 14:35:40.349 * Parent agreed to stop sending diffs. Finalizing AOF...
86220:C 14 Nov 2020 14:35:40.349 * Concatenating 0.00 MB of AOF diff received from parent.
86220:C 14 Nov 2020 14:35:40.350 * SYNC append only file rewrite performed
86220:C 14 Nov 2020 14:35:40.350 * AOF rewrite: 0 MB of memory used by copy-on-write
86216:S 14 Nov 2020 14:35:40.387 * Background AOF rewrite terminated with success
86216:S 14 Nov 2020 14:35:40.387 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
86216:S 14 Nov 2020 14:35:40.387 * Background AOF rewrite finished successfully

Checking the cluster state again:

192.168.5.102:6379> cluster nodes
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 master - 0 1605335837000 7 connected 0-5460
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605335837414 2 connected
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 slave 13ab2f0291cea595f51d6efac60c3e62278e64cb 0 1605335837000 7 connected
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605335837000 3 connected 10923-16383
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605335838422 3 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 myself,master - 0 1605335836000 2 connected 5461-10922
192.168.5.102:6379>

