[Redis Core Knowledge, Part 8] Redis Clustering: Cluster Mode and Cluster Setup (Part 2)


Create the cluster with the redis-trib.rb command:

./redis-trib.rb create --replicas 1  192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380

Instead of creating the cluster, the command prints the following:

[root@192 src]# ./redis-trib.rb create --replicas 1  192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380
WARNING: redis-trib.rb is not longer available!
You should use redis-cli instead.
All commands and features belonging to redis-trib.rb have been moved
to redis-cli.
In order to use them you should call redis-cli with the --cluster
option followed by the subcommand name, arguments and options.
Use the following syntax:
redis-cli --cluster SUBCOMMAND [ARGUMENTS] [OPTIONS]
Example:
redis-cli --cluster create 192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380 --cluster-replicas 1
To get help about all subcommands, type:
redis-cli --cluster help

Creating the cluster and reviewing the startup logs

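Practice is the best teacher: newer Redis versions (5.0 and later; this walkthrough uses 6.0.8) no longer depend on Ruby to create a cluster. redis-cli does the job directly: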

[root@192 src]# redis-cli --cluster create 192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380 --cluster-replicas 1
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 192.168.5.102:6380 to 192.168.5.101:6379
Adding replica 192.168.5.103:6380 to 192.168.5.102:6379
Adding replica 192.168.5.101:6380 to 192.168.5.103:6379
M: 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379
   slots:[0-5460] (5461 slots) master
M: 94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379
   slots:[5461-10922] (5462 slots) master
M: 6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379
   slots:[10923-16383] (5461 slots) master
S: 13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380
   replicates 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b
S: 46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380
   replicates 94d9e1637bd1701b146e367ffa7a69e8c24566e8
S: bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380
   replicates 6c1005a89742e50db240775204c03ab3d7558e59
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.
>>> Performing Cluster Check (using node 192.168.5.101:6379)
M: 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380
   slots: (0 slots) slave
   replicates 6c1005a89742e50db240775204c03ab3d7558e59
S: 46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380
   slots: (0 slots) slave
   replicates 94d9e1637bd1701b146e367ffa7a69e8c24566e8
M: 94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380
   slots: (0 slots) slave
   replicates 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b
M: 6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[root@192 src]#
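
The three ranges in the output above are what you get from dividing the 16384 hash slots as evenly as possible among the masters. The sketch below reproduces them. `split_slots` is a hypothetical helper written for illustration, and its rounding rule is an assumption chosen to match this output rather than a copy of the redis-cli source.

```python
import math

TOTAL_SLOTS = 16384  # a Redis Cluster always has exactly 16384 hash slots


def split_slots(master_count):
    """Divide the slot space into contiguous (first, last) ranges, one per master."""
    per_master = TOTAL_SLOTS / master_count
    ranges = []
    first = 0
    cursor = 0.0
    for index in range(master_count):
        # Assumed rounding rule: round half up to the nearest slot.
        last = math.floor(cursor + per_master - 1 + 0.5)
        if index == master_count - 1 or last > TOTAL_SLOTS - 1:
            last = TOTAL_SLOTS - 1  # the final master absorbs the remainder
        ranges.append((first, last))
        first = last + 1
        cursor += per_master
    return ranges


for number, (first, last) in enumerate(split_slots(3)):
    print(f"Master[{number}] -> Slots {first} - {last} ({last - first + 1} slots)")
```

With three masters this gives 5461, 5462 and 5461 slots, the same counts redis-cli reports for each master above.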

The node configuration files on each instance now look like this:

Node configuration file and server log for port 6379 on machine 101

bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605332446460 3 connected
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605332448477 2 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 master - 0 1605332448000 2 connected 5461-10922
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 slave 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 0 1605332446000 1 connected
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605332447469 3 connected 10923-16383
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 myself,master - 0 1605332447000 1 connected 0-5460
vars currentEpoch 6 lastVoteEpoch 0
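
Each node line in this file is a space-separated record: node ID, address with cluster bus port, flags, master ID (or `-`), ping-sent time, pong-received time, config epoch, link state, and any slot ranges. Here is a minimal sketch of reading one such line. `parse_node_line` and the dictionary keys are names made up for this example.

```python
def parse_node_line(line):
    """Split one node record into named fields."""
    parts = line.split()
    node_id, address, flags, master_id = parts[0:4]
    host_port, _, bus_part = address.partition("@")
    host, _, port = host_port.rpartition(":")
    bus_port = bus_part.split(",")[0]  # tolerate a trailing ",hostname" suffix
    slots = []
    for token in parts[8:]:
        if token.startswith("["):
            continue  # skip slots that are in the middle of a migration
        first, _, last = token.partition("-")
        slots.append((int(first), int(last or first)))
    return {
        "id": node_id,
        "host": host,
        "port": int(port),
        "bus_port": int(bus_port) if bus_port else None,
        "flags": flags.split(","),
        "master_id": None if master_id == "-" else master_id,
        "config_epoch": int(parts[6]),
        "link_state": parts[7],
        "slots": slots,
    }


line = ("2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 "
        "myself,master - 0 1605332447000 1 connected 0-5460")
node = parse_node_line(line)
print(node["flags"], node["slots"], node["bus_port"])
```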

Here is that master's server log during cluster creation:

70362:M 14 Nov 2020 13:40:43.171 # configEpoch set to 1 via CLUSTER SET-CONFIG-EPOCH
70362:M 14 Nov 2020 13:40:43.211 # IP address for this node updated to 192.168.5.101
70362:M 14 Nov 2020 13:40:46.150 * Replica 192.168.5.102:6380 asks for synchronization
70362:M 14 Nov 2020 13:40:46.150 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '5c437fa82b0ca0caabfa2f0a17d15d9fc8f2548f', my replication IDs are 'd162ace4d79189e94f2a1f202c6fec4230a4189d' and '0000000000000000000000000000000000000000')
70362:M 14 Nov 2020 13:40:46.150 * Replication backlog created, my new replication IDs are '64bd8964903df1d43df728eed12e7c00956674f3' and '0000000000000000000000000000000000000000'
70362:M 14 Nov 2020 13:40:46.150 * Starting BGSAVE for SYNC with target: disk
70362:M 14 Nov 2020 13:40:46.151 * Background saving started by pid 85385
85385:C 14 Nov 2020 13:40:46.165 * DB saved on disk
85385:C 14 Nov 2020 13:40:46.167 * RDB: 0 MB of memory used by copy-on-write
70362:M 14 Nov 2020 13:40:46.258 * Background saving terminated with success
70362:M 14 Nov 2020 13:40:46.258 * Synchronization with replica 192.168.5.102:6380 succeeded
70362:M 14 Nov 2020 13:40:48.174 # Cluster state changed: ok
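
The "Partial resynchronization not accepted" line reflects a check the master performs on each sync request. The replica sends the replication ID and offset it last knew, and the master can continue from there only if the ID matches one of its own two IDs and the offset is still inside its backlog. The sketch below is a simplified model of that decision, not the Redis implementation. The function name, parameters, requested offset and backlog numbers are all assumptions for illustration.

```python
def can_partial_resync(requested_id, requested_offset,
                       primary_id, secondary_id, secondary_valid_up_to,
                       backlog_start, backlog_length):
    """Return True if a master could serve the replica from its backlog."""
    if requested_id == primary_id:
        id_ok = True
    elif requested_id == secondary_id:
        # The previous ID is only trusted up to the offset where it was retired.
        id_ok = requested_offset <= secondary_valid_up_to
    else:
        id_ok = False
    if not id_ok or backlog_length == 0:
        return False
    return backlog_start <= requested_offset <= backlog_start + backlog_length


# IDs taken from the log above; the offset and empty backlog are assumed values.
accepted = can_partial_resync(
    requested_id="5c437fa82b0ca0caabfa2f0a17d15d9fc8f2548f",
    requested_offset=1,
    primary_id="d162ace4d79189e94f2a1f202c6fec4230a4189d",
    secondary_id="0000000000000000000000000000000000000000",
    secondary_valid_up_to=-1,
    backlog_start=1,
    backlog_length=0,
)
print("partial resync" if accepted else "full resync")
```

Because the requested ID matches neither of the master's IDs, the model falls through to a full resync, which is what the BGSAVE lines in the log show.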

Node configuration file and server log for port 6380 on machine 101

6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605332445000 3 connected 10923-16383
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 myself,slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605332446000 3 connected
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605332445452 2 connected
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 slave 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 0 1605332446000 1 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 master - 0 1605332447480 2 connected 5461-10922
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 master - 0 1605332446468 1 connected 0-5460
vars currentEpoch 6 lastVoteEpoch 0

And here is that replica's server log during cluster creation:

70795:M 14 Nov 2020 13:40:43.173 # configEpoch set to 6 via CLUSTER SET-CONFIG-EPOCH
70795:M 14 Nov 2020 13:40:43.334 # IP address for this node updated to 192.168.5.101
70795:S 14 Nov 2020 13:40:45.182 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
70795:S 14 Nov 2020 13:40:45.182 # Cluster state changed: ok
70795:S 14 Nov 2020 13:40:45.551 * Connecting to MASTER 192.168.5.103:6379
70795:S 14 Nov 2020 13:40:45.551 * MASTER <-> REPLICA sync started
70795:S 14 Nov 2020 13:40:45.551 * Non blocking connect for SYNC fired the event.
70795:S 14 Nov 2020 13:40:45.552 * Master replied to PING, replication can continue...
70795:S 14 Nov 2020 13:40:45.552 * Trying a partial resynchronization (request de83e39b5538aec381a9b7cb415f0afd9e9fb69e:1).
70795:S 14 Nov 2020 13:40:45.553 * Full resync from master: 36c2011a9e000813ddbe35ea240fd347ad351a04:0
70795:S 14 Nov 2020 13:40:45.553 * Discarding previously cached master state.
70795:S 14 Nov 2020 13:40:45.634 * MASTER <-> REPLICA sync: receiving 175 bytes from master to disk
70795:S 14 Nov 2020 13:40:45.634 * MASTER <-> REPLICA sync: Flushing old data
70795:S 14 Nov 2020 13:40:45.659 * MASTER <-> REPLICA sync: Loading DB in memory
70795:S 14 Nov 2020 13:40:45.660 * Loading RDB produced by version 6.0.8
70795:S 14 Nov 2020 13:40:45.660 * RDB age 0 seconds
70795:S 14 Nov 2020 13:40:45.660 * RDB memory usage when created 2.46 Mb
70795:S 14 Nov 2020 13:40:45.660 * MASTER <-> REPLICA sync: Finished with success
70795:S 14 Nov 2020 13:40:45.661 * Background append only file rewriting started by pid 85384
70795:S 14 Nov 2020 13:40:45.700 * AOF rewrite child asks to stop sending diffs.
85384:C 14 Nov 2020 13:40:45.700 * Parent agreed to stop sending diffs. Finalizing AOF...
85384:C 14 Nov 2020 13:40:45.700 * Concatenating 0.00 MB of AOF diff received from parent.
85384:C 14 Nov 2020 13:40:45.700 * SYNC append only file rewrite performed
85384:C 14 Nov 2020 13:40:45.701 * AOF rewrite: 0 MB of memory used by copy-on-write
70795:S 14 Nov 2020 13:40:45.762 * Background AOF rewrite terminated with success
70795:S 14 Nov 2020 13:40:45.762 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
70795:S 14 Nov 2020 13:40:45.762 * Background AOF rewrite finished successfully

Reading and writing data in the cluster

The server output above shows the hash slot ranges assigned to each master. Next, let's see how data is actually stored and retrieved:

[root@192 redis-6.0.8]# redis-cli -h 192.168.5.101
192.168.5.101:6379> set love guochengyu
(error) MOVED 16198 192.168.5.103:6379
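
The error reply above has the form `MOVED <slot> <host>:<port>`, which tells the client which node owns the slot. Here is a minimal sketch of how a client might parse it before retrying against the right node. `parse_moved` is a hypothetical helper, not part of any Redis client library.

```python
def parse_moved(error_text):
    """Parse 'MOVED <slot> <host>:<port>' into (slot, host, port), or None."""
    parts = error_text.split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    return int(parts[1]), host, int(port)


print(parse_moved("MOVED 16198 192.168.5.103:6379"))
```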

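The key love hashes to slot 16198, which belongs to master 3 (192.168.5.103:6379), so the node we connected to rejects the write with a MOVED error instead of storing the value. To have the client follow such redirects automatically for both writes and reads, just add the -c option to the redis-cli command: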

[root@192 redis-6.0.8]#  redis-cli -h 192.168.5.101 -c
192.168.5.101:6379> set love guochenyubaobei
-> Redirected to slot [16198] located at 192.168.5.103:6379
OK
192.168.5.103:6379> get love
"guochenyubaobei"
192.168.5.103:6379>
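
Slot 16198 is not arbitrary. Redis Cluster maps a key to a slot by taking the CRC16 of the key (the XMODEM variant) modulo 16384, and if the key contains a non-empty `{...}` hash tag, only the tag is hashed. The sketch below re-implements that rule and looks up the owner using the slot ranges from the cluster creation output. `SLOT_OWNERS` and the function names are made up for this example.

```python
def crc16_xmodem(data):
    """CRC16 with polynomial 0x1021, initial value 0, no bit reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key):
    """Return the hash slot for a key, honoring a non-empty {hash tag}."""
    raw = key.encode()
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


# Slot ranges as assigned when the cluster was created in this walkthrough.
SLOT_OWNERS = [
    (0, 5460, "192.168.5.101:6379"),
    (5461, 10922, "192.168.5.102:6379"),
    (10923, 16383, "192.168.5.103:6379"),
]


def owner_of(slot):
    """Find which master serves a slot."""
    for first, last, node in SLOT_OWNERS:
        if first <= slot <= last:
            return node
    raise ValueError(f"slot {slot} is not covered")


slot = key_slot("love")
print(slot, owner_of(slot))
```

Keys that share a hash tag, such as `{love}:a` and `{love}:b`, land in the same slot, which is what makes multi-key commands on them possible in a cluster.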

Master and replica failover in the cluster

There are two cases to look at: a master going offline, and a replica going offline.

Losing a lowly slave is no big deal

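When a replica goes offline, the master's log shows it being marked as failed once it has been unreachable for roughly 10 seconds (about 12 seconds in the logs below; the threshold is governed by the cluster-node-timeout setting). When the replica comes back online, the master replicates data to it again. All other masters and replicas keep track of the state of the whole cluster.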

Shutting down the 102 slave (192.168.5.102:6380) of the 101 master

20189:signal-handler (1605334497) Received SIGINT scheduling shutdown...
20189:S 14 Nov 2020 14:14:57.915 # User requested shutdown...
20189:S 14 Nov 2020 14:14:57.915 * Calling fsync() on the AOF file.
20189:S 14 Nov 2020 14:14:57.915 * Saving the final RDB snapshot before exiting.
20189:S 14 Nov 2020 14:14:57.916 * DB saved on disk
20189:S 14 Nov 2020 14:14:57.916 # Redis is now ready to exit, bye bye...
[root@192 config]#

Its master on 101 loses its replica:

70362:M 14 Nov 2020 14:14:57.919 # Connection with replica 192.168.5.102:6380 lost.
70362:M 14 Nov 2020 14:15:10.105 * FAIL message received from 6c1005a89742e50db240775204c03ab3d7558e59 about 13ab2f0291cea595f51d6efac60c3e62278e64cb

The 101 master heard this from the 103 master, so the 103 master was the first to mark the node as failed:

9496:M 14 Nov 2020 14:15:10.099 * Marking node 13ab2f0291cea595f51d6efac60c3e62278e64cb as failing (quorum reached).

The other nodes received the same notification.
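
"Quorum reached" in the 103 master's log means that a majority of masters had reported the node as unreachable. Here is a rough sketch of that majority rule, under the simplifying assumption that only reports from masters count and each master counts once. The function names and the short node labels are illustrative only.

```python
def needed_quorum(master_count):
    """A majority of the masters in the cluster."""
    return master_count // 2 + 1


def is_marked_fail(reporting_masters, master_count):
    """True once enough distinct masters have flagged the node as unreachable."""
    return len(set(reporting_masters)) >= needed_quorum(master_count)


# Three masters in this cluster, so two reports are enough.
print(needed_quorum(3))
print(is_marked_fail(["103:6379"], 3))
print(is_marked_fail(["103:6379", "101:6379"], 3))
```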

Restarting the 102 slave of the 101 master

When the 102 slave comes back online, every node is notified and clears the failure mark:

9496:M 14 Nov 2020 14:23:29.425 * Clear FAIL state for node 13ab2f0291cea595f51d6efac60c3e62278e64cb: replica is reachable again.

Meanwhile, the master of the 102 slave also handles the master-to-replica synchronization:

70362:M 14 Nov 2020 14:23:29.463 * Clear FAIL state for node 13ab2f0291cea595f51d6efac60c3e62278e64cb: replica is reachable again.
70362:M 14 Nov 2020 14:23:30.381 * Replica 192.168.5.102:6380 asks for synchronization
70362:M 14 Nov 2020 14:23:30.381 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'eeea1b2c358998aacc3dcf525484245183fcb9ab', my replication IDs are '64bd8964903df1d43df728eed12e7c00956674f3' and '0000000000000000000000000000000000000000')
70362:M 14 Nov 2020 14:23:30.381 * Starting BGSAVE for SYNC with target: disk
70362:M 14 Nov 2020 14:23:30.382 * Background saving started by pid 86035
86035:C 14 Nov 2020 14:23:30.386 * DB saved on disk
86035:C 14 Nov 2020 14:23:30.388 * RDB: 0 MB of memory used by copy-on-write
70362:M 14 Nov 2020 14:23:30.470 * Background saving terminated with success
70362:M 14 Nov 2020 14:23:30.470 * Synchronization with replica 192.168.5.102:6380 succeeded

And the 102 slave, naturally, has to sync its data from the master:

21392:M 14 Nov 2020 14:23:29.366 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
21392:M 14 Nov 2020 14:23:29.366 # Server initialized
21392:M 14 Nov 2020 14:23:29.366 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
21392:M 14 Nov 2020 14:23:29.366 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
21392:M 14 Nov 2020 14:23:29.367 * Reading RDB preamble from AOF file...
21392:M 14 Nov 2020 14:23:29.367 * Loading RDB produced by version 6.0.8
21392:M 14 Nov 2020 14:23:29.367 * RDB age 2563 seconds
21392:M 14 Nov 2020 14:23:29.367 * RDB memory usage when created 2.43 Mb
21392:M 14 Nov 2020 14:23:29.367 * RDB has an AOF tail
21392:M 14 Nov 2020 14:23:29.367 * Reading the remaining AOF tail...
21392:M 14 Nov 2020 14:23:29.367 * DB loaded from append only file: 0.000 seconds
21392:M 14 Nov 2020 14:23:29.367 * Ready to accept connections
21392:S 14 Nov 2020 14:23:29.367 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
21392:S 14 Nov 2020 14:23:29.367 # Cluster state changed: ok
21392:S 14 Nov 2020 14:23:30.374 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:23:30.374 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:23:30.375 * Non blocking connect for SYNC fired the event.
21392:S 14 Nov 2020 14:23:30.377 * Master replied to PING, replication can continue...
21392:S 14 Nov 2020 14:23:30.379 * Trying a partial resynchronization (request eeea1b2c358998aacc3dcf525484245183fcb9ab:1).
21392:S 14 Nov 2020 14:23:30.381 * Full resync from master: 64bd8964903df1d43df728eed12e7c00956674f3:2856
21392:S 14 Nov 2020 14:23:30.381 * Discarding previously cached master state.
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: receiving 176 bytes from master to disk
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: Flushing old data
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: Loading DB in memory
21392:S 14 Nov 2020 14:23:30.469 * Loading RDB produced by version 6.0.8
21392:S 14 Nov 2020 14:23:30.469 * RDB age 0 seconds
21392:S 14 Nov 2020 14:23:30.469 * RDB memory usage when created 2.45 Mb
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: Finished with success
21392:S 14 Nov 2020 14:23:30.469 * Background append only file rewriting started by pid 21396
21392:S 14 Nov 2020 14:23:30.519 * AOF rewrite child asks to stop sending diffs.
21396:C 14 Nov 2020 14:23:30.520 * Parent agreed to stop sending diffs. Finalizing AOF...
21396:C 14 Nov 2020 14:23:30.520 * Concatenating 0.00 MB of AOF diff received from parent.
21396:C 14 Nov 2020 14:23:30.520 * SYNC append only file rewrite performed
21396:C 14 Nov 2020 14:23:30.520 * AOF rewrite: 0 MB of memory used by copy-on-write
21392:S 14 Nov 2020 14:23:30.576 * Background AOF rewrite terminated with success
21392:S 14 Nov 2020 14:23:30.576 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
21392:S 14 Nov 2020 14:23:30.576 * Background AOF rewrite finished successfully

A change of dynasty: the master becomes a slave

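Now for the main event, a change of dynasty: the old master loses contact for a while, its slave usurps the throne, and when the old master returns it has to settle for being a lowly slave. When a master goes offline, its replica retries the connection about once per second. After roughly 10 seconds of failed attempts (about 13 seconds in the log below), the rest of the cluster marks the master as failed. The replica then starts a failover election, wins it, and promotes itself to master. When the old master reconnects, it finds the dynasty has already changed and becomes a slave. All other masters and replicas keep track of the state of the whole cluster.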

Shutting down the 101 master

After the shutdown, its 102 slave keeps trying to connect to the configured master but cannot reach it:

21392:S 14 Nov 2020 14:30:21.427 # Connection with master lost.
21392:S 14 Nov 2020 14:30:21.427 * Caching the disconnected master state.
21392:S 14 Nov 2020 14:30:21.500 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:21.500 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:21.500 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:22.508 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:22.509 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:22.509 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:23.517 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:23.517 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:23.517 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:24.525 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:24.525 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:24.526 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:25.530 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:25.530 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:25.530 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:26.538 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:26.538 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:26.538 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:27.546 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:27.546 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:27.546 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:28.556 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:28.556 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:28.556 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:29.566 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:29.566 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:29.567 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:30.574 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:30.574 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:30.575 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:31.580 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:31.580 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:31.580 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:32.588 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:32.588 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:32.588 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:33.596 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:33.596 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:33.596 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:34.246 * FAIL message received from 6c1005a89742e50db240775204c03ab3d7558e59 about 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b
21392:S 14 Nov 2020 14:30:34.246 # Cluster state changed: fail
21392:S 14 Nov 2020 14:30:34.304 # Start of election delayed for 792 milliseconds (rank #0, offset 3430).
21392:S 14 Nov 2020 14:30:34.605 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:34.605 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:34.606 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:35.108 # Starting a failover election for epoch 7.
21392:S 14 Nov 2020 14:30:35.110 # Failover election won: I'm the new master.
21392:S 14 Nov 2020 14:30:35.110 # configEpoch set to 7 after successful failover
21392:M 14 Nov 2020 14:30:35.110 * Discarding previously cached master state.
21392:M 14 Nov 2020 14:30:35.110 # Setting secondary replication ID to 64bd8964903df1d43df728eed12e7c00956674f3, valid up to offset: 3431. New replication ID is 3088efe0ca743a931b3863cbb0fd673811c31b7a
21392:M 14 Nov 2020 14:30:35.110 # Cluster state changed: ok

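So the replica seizes the throne and sets itself as master, and the other nodes are all notified. The cluster information now looks like this: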

[root@192 redis-6.0.8]#  redis-cli -h 192.168.5.102 -c                    
192.168.5.102:6379> cluster nodes
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 master - 0 1605335617797 7 connected 0-5460
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605335615777 2 connected
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 master,fail - 1605335422306 1605335419000 1 disconnected
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605335614000 3 connected 10923-16383
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605335616788 3 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 myself,master - 0 1605335615000 2 connected 5461-10922
192.168.5.102:6379>

We can see one failed master and one master that successfully seized the throne.
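
The replica's log line `Start of election delayed for 792 milliseconds (rank #0, offset 3430)` follows a rule from the Redis Cluster specification: a replica waits 500 ms, plus a random 0 to 500 ms, plus 1000 ms per rank, where rank 0 is the replica with the most up-to-date replication offset. Here is a small sketch of that formula. The function name is made up, and the real implementation may apply further adjustments.

```python
import random


def election_delay_ms(rank, rng=random):
    """Delay before a replica asks for votes: fixed + random jitter + rank penalty."""
    return 500 + rng.randint(0, 500) + rank * 1000


# Rank 0 always falls between 500 and 1000 ms, consistent with the 792 ms in the log.
delay = election_delay_ms(0)
print(500 <= delay <= 1000)
```

The fixed part gives the FAIL state time to propagate, and the rank penalty lets the best-synchronized replica ask for votes first.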

Restarting the 101 master (or rather, what is now the 101 slave, serving the 102 master)

Although it has restarted, it can now only exist as a slave, so after starting up it has to sync its data from the new master:

86216:M 14 Nov 2020 14:35:39.278 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
86216:M 14 Nov 2020 14:35:39.278 # Server initialized
86216:M 14 Nov 2020 14:35:39.278 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
86216:M 14 Nov 2020 14:35:39.278 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
86216:M 14 Nov 2020 14:35:39.278 * Ready to accept connections
86216:M 14 Nov 2020 14:35:39.279 # Configuration change detected. Reconfiguring myself as a replica of 13ab2f0291cea595f51d6efac60c3e62278e64cb
86216:S 14 Nov 2020 14:35:39.279 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
86216:S 14 Nov 2020 14:35:39.279 # Cluster state changed: ok
86216:S 14 Nov 2020 14:35:40.285 * Connecting to MASTER 192.168.5.102:6380
86216:S 14 Nov 2020 14:35:40.285 * MASTER <-> REPLICA sync started
86216:S 14 Nov 2020 14:35:40.285 * Non blocking connect for SYNC fired the event.
86216:S 14 Nov 2020 14:35:40.286 * Master replied to PING, replication can continue...
86216:S 14 Nov 2020 14:35:40.287 * Trying a partial resynchronization (request 647cca874c4c83a50d4fe5f82690eb51df36aa6d:1).
86216:S 14 Nov 2020 14:35:40.287 * Full resync from master: 3088efe0ca743a931b3863cbb0fd673811c31b7a:3430
86216:S 14 Nov 2020 14:35:40.287 * Discarding previously cached master state.
86216:S 14 Nov 2020 14:35:40.307 * MASTER <-> REPLICA sync: receiving 176 bytes from master to disk
86216:S 14 Nov 2020 14:35:40.307 * MASTER <-> REPLICA sync: Flushing old data
86216:S 14 Nov 2020 14:35:40.309 * MASTER <-> REPLICA sync: Loading DB in memory
86216:S 14 Nov 2020 14:35:40.309 * Loading RDB produced by version 6.0.8
86216:S 14 Nov 2020 14:35:40.309 * RDB age 0 seconds
86216:S 14 Nov 2020 14:35:40.309 * RDB memory usage when created 2.45 Mb
86216:S 14 Nov 2020 14:35:40.309 * MASTER <-> REPLICA sync: Finished with success
86216:S 14 Nov 2020 14:35:40.309 * Background append only file rewriting started by pid 86220
86216:S 14 Nov 2020 14:35:40.349 * AOF rewrite child asks to stop sending diffs.
86220:C 14 Nov 2020 14:35:40.349 * Parent agreed to stop sending diffs. Finalizing AOF...
86220:C 14 Nov 2020 14:35:40.349 * Concatenating 0.00 MB of AOF diff received from parent.
86220:C 14 Nov 2020 14:35:40.350 * SYNC append only file rewrite performed
86220:C 14 Nov 2020 14:35:40.350 * AOF rewrite: 0 MB of memory used by copy-on-write
86216:S 14 Nov 2020 14:35:40.387 * Background AOF rewrite terminated with success
86216:S 14 Nov 2020 14:35:40.387 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
86216:S 14 Nov 2020 14:35:40.387 * Background AOF rewrite finished successfully

Checking the cluster state again:

192.168.5.102:6379> cluster nodes
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 master - 0 1605335837000 7 connected 0-5460
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605335837414 2 connected
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 slave 13ab2f0291cea595f51d6efac60c3e62278e64cb 0 1605335837000 7 connected
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605335837000 3 connected 10923-16383
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605335838422 3 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 myself,master - 0 1605335836000 2 connected 5461-10922
192.168.5.102:6379>
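
A quick way to confirm that the cluster is healthy after the failover is to check that the masters in this output still cover all 16384 slots and that each of them has a replica. Here is a minimal sketch over the text above. `summarize` is a hypothetical helper, and it assumes plain `first-last` slot tokens with no migration in progress.

```python
CLUSTER_NODES = """\
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 master - 0 1605335837000 7 connected 0-5460
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605335837414 2 connected
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 slave 13ab2f0291cea595f51d6efac60c3e62278e64cb 0 1605335837000 7 connected
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605335837000 3 connected 10923-16383
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605335838422 3 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 myself,master - 0 1605335836000 2 connected 5461-10922
"""


def summarize(text):
    """Return (number of covered slots, {master_id: [replica_id, ...]})."""
    covered = 0
    replicas = {}
    for line in text.strip().splitlines():
        parts = line.split()
        node_id, flags, master_id = parts[0], parts[2].split(","), parts[3]
        if "master" in flags:
            replicas.setdefault(node_id, [])
            for token in parts[8:]:
                first, _, last = token.partition("-")
                covered += int(last or first) - int(first) + 1
        elif master_id != "-":
            replicas.setdefault(master_id, []).append(node_id)
    return covered, replicas


covered, replicas = summarize(CLUSTER_NODES)
print(covered == 16384, all(len(ids) == 1 for ids in replicas.values()))
```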

