Starting the cluster: use the redis-trib.rb script to create it
./redis-trib.rb create --replicas 1 192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380
But the command fails with the following message:
[root@192 src]# ./redis-trib.rb create --replicas 1 192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380
WARNING: redis-trib.rb is not longer available!
You should use redis-cli instead.

All commands and features belonging to redis-trib.rb have been moved to redis-cli.
In order to use them you should call redis-cli with the --cluster option followed by the subcommand name, arguments and options.

Use the following syntax:
redis-cli --cluster SUBCOMMAND [ARGUMENTS] [OPTIONS]

Example:
redis-cli --cluster create 192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380 --cluster-replicas 1

To get help about all subcommands, type:
redis-cli --cluster help
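The warning spells out how the old redis-trib.rb arguments map onto the new redis-cli syntax. As a rough sketch of that mapping, the helper below assembles the new command line from a node list and a replica count. The function name build_cluster_create_cmd is made up for illustration; only the redis-cli options quoted in the warning are used.

```python
def build_cluster_create_cmd(nodes, replicas):
    """Return the argv list for `redis-cli --cluster create` (hypothetical helper)."""
    if not nodes:
        raise ValueError("at least one host:port is required")
    # Old form: ./redis-trib.rb create --replicas N host:port ...
    # New form: redis-cli --cluster create host:port ... --cluster-replicas N
    return ["redis-cli", "--cluster", "create", *nodes,
            "--cluster-replicas", str(replicas)]


nodes = [
    "192.168.5.101:6379", "192.168.5.102:6379", "192.168.5.103:6379",
    "192.168.5.102:6380", "192.168.5.103:6380", "192.168.5.101:6380",
]
cmd = build_cluster_create_cmd(nodes, 1)
print(" ".join(cmd))
```

The list form can be handed to subprocess.run as-is, which avoids shell quoting issues.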
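Start the cluster and look at the startup logs
Practice is the best teacher: in recent Redis versions, creating a cluster no longer depends on Ruby, and redis-cli does the job directly: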
[root@192 src]# redis-cli --cluster create 192.168.5.101:6379 192.168.5.102:6379 192.168.5.103:6379 192.168.5.102:6380 192.168.5.103:6380 192.168.5.101:6380 --cluster-replicas 1
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 192.168.5.102:6380 to 192.168.5.101:6379
Adding replica 192.168.5.103:6380 to 192.168.5.102:6379
Adding replica 192.168.5.101:6380 to 192.168.5.103:6379
M: 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379
   slots:[0-5460] (5461 slots) master
M: 94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379
   slots:[5461-10922] (5462 slots) master
M: 6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379
   slots:[10923-16383] (5461 slots) master
S: 13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380
   replicates 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b
S: 46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380
   replicates 94d9e1637bd1701b146e367ffa7a69e8c24566e8
S: bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380
   replicates 6c1005a89742e50db240775204c03ab3d7558e59
Can I set the above configuration? (type 'yes' to accept): yes
>>> Nodes configuration updated
>>> Assign a different config epoch to each node
>>> Sending CLUSTER MEET messages to join the cluster
Waiting for the cluster to join
.
>>> Performing Cluster Check (using node 192.168.5.101:6379)
M: 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379
   slots:[0-5460] (5461 slots) master
   1 additional replica(s)
S: bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380
   slots: (0 slots) slave
   replicates 6c1005a89742e50db240775204c03ab3d7558e59
S: 46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380
   slots: (0 slots) slave
   replicates 94d9e1637bd1701b146e367ffa7a69e8c24566e8
M: 94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379
   slots:[5461-10922] (5462 slots) master
   1 additional replica(s)
S: 13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380
   slots: (0 slots) slave
   replicates 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b
M: 6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379
   slots:[10923-16383] (5461 slots) master
   1 additional replica(s)
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
[root@192 src]#
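The three slot ranges in this output (5461, 5462 and 5461 slots) come from dividing the 16384 slots as evenly as possible among the masters. The sketch below reproduces exactly those ranges. The rounding rule is an assumption chosen to match the output above, not a copy of redis-cli's code, and allocate_slots is a made-up name.

```python
import math

TOTAL_SLOTS = 16384


def allocate_slots(master_count):
    """Split slots 0..16383 into contiguous ranges, one per master (illustrative)."""
    slots_per_node = TOTAL_SLOTS / master_count  # fractional share per master
    ranges = []
    first = 0
    cursor = 0.0
    for index in range(master_count):
        # Round the running boundary half-up; assumption made to match the output.
        last = math.floor(cursor + slots_per_node - 1 + 0.5)
        if last > TOTAL_SLOTS - 1 or index == master_count - 1:
            last = TOTAL_SLOTS - 1  # the final master always ends at slot 16383
        ranges.append((first, last))
        first = last + 1
        cursor += slots_per_node
    return ranges


for number, (start, end) in enumerate(allocate_slots(3)):
    print(f"Master[{number}] -> Slots {start} - {end}")
```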
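The node configuration files of the individual instances look like this:
Node configuration file and server activity log of 6379 on machine 101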
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605332446460 3 connected
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605332448477 2 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 master - 0 1605332448000 2 connected 5461-10922
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 slave 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 0 1605332446000 1 connected
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605332447469 3 connected 10923-16383
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 myself,master - 0 1605332447000 1 connected 0-5460
vars currentEpoch 6 lastVoteEpoch 0
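Each node line in this file has the same layout as the output of CLUSTER NODES: node ID, address with cluster bus port, flags, master ID (or "-"), ping-sent and pong-received timestamps, config epoch, link state, then any slot ranges. A minimal parsing sketch follows; the ClusterNode class and parse_node_line function are names invented here, and slot entries in square brackets (used during migrations) are simply skipped.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterNode:
    node_id: str
    host: str
    port: int
    bus_port: int
    flags: list
    master_id: str          # "-" when the node is a master
    config_epoch: int
    link_state: str
    slots: list = field(default_factory=list)


def parse_node_line(line):
    """Parse one node line of nodes.conf / CLUSTER NODES (illustrative sketch)."""
    parts = line.split()
    node_id, addr, flags, master_id = parts[0], parts[1], parts[2], parts[3]
    # parts[4] and parts[5] are the ping-sent and pong-received timestamps.
    config_epoch, link_state = int(parts[6]), parts[7]
    host_port, _, bus_port = addr.partition("@")
    host, _, port = host_port.rpartition(":")
    slots = []
    for token in parts[8:]:
        if token.startswith("["):
            continue  # migrating/importing slot markers are ignored here
        start, _, end = token.partition("-")
        slots.append((int(start), int(end or start)))
    return ClusterNode(node_id, host, int(port), int(bus_port),
                       flags.split(","), master_id, config_epoch,
                       link_state, slots)


node = parse_node_line(
    "2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 "
    "myself,master - 0 1605332447000 1 connected 0-5460"
)
print(node.host, node.port, node.flags, node.slots)
```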
Here is the activity log of this master node while the cluster was being created:
70362:M 14 Nov 2020 13:40:43.171 # configEpoch set to 1 via CLUSTER SET-CONFIG-EPOCH
70362:M 14 Nov 2020 13:40:43.211 # IP address for this node updated to 192.168.5.101
70362:M 14 Nov 2020 13:40:46.150 * Replica 192.168.5.102:6380 asks for synchronization
70362:M 14 Nov 2020 13:40:46.150 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for '5c437fa82b0ca0caabfa2f0a17d15d9fc8f2548f', my replication IDs are 'd162ace4d79189e94f2a1f202c6fec4230a4189d' and '0000000000000000000000000000000000000000')
70362:M 14 Nov 2020 13:40:46.150 * Replication backlog created, my new replication IDs are '64bd8964903df1d43df728eed12e7c00956674f3' and '0000000000000000000000000000000000000000'
70362:M 14 Nov 2020 13:40:46.150 * Starting BGSAVE for SYNC with target: disk
70362:M 14 Nov 2020 13:40:46.151 * Background saving started by pid 85385
85385:C 14 Nov 2020 13:40:46.165 * DB saved on disk
85385:C 14 Nov 2020 13:40:46.167 * RDB: 0 MB of memory used by copy-on-write
70362:M 14 Nov 2020 13:40:46.258 * Background saving terminated with success
70362:M 14 Nov 2020 13:40:46.258 * Synchronization with replica 192.168.5.102:6380 succeeded
70362:M 14 Nov 2020 13:40:48.174 # Cluster state changed: ok
Node configuration file and server activity log of 6380 on machine 101
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605332445000 3 connected 10923-16383
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 myself,slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605332446000 3 connected
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605332445452 2 connected
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 slave 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 0 1605332446000 1 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 master - 0 1605332447480 2 connected 5461-10922
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 master - 0 1605332446468 1 connected 0-5460
vars currentEpoch 6 lastVoteEpoch 0
And here is the activity log of this replica node while the cluster was being created:
70795:M 14 Nov 2020 13:40:43.173 # configEpoch set to 6 via CLUSTER SET-CONFIG-EPOCH
70795:M 14 Nov 2020 13:40:43.334 # IP address for this node updated to 192.168.5.101
70795:S 14 Nov 2020 13:40:45.182 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
70795:S 14 Nov 2020 13:40:45.182 # Cluster state changed: ok
70795:S 14 Nov 2020 13:40:45.551 * Connecting to MASTER 192.168.5.103:6379
70795:S 14 Nov 2020 13:40:45.551 * MASTER <-> REPLICA sync started
70795:S 14 Nov 2020 13:40:45.551 * Non blocking connect for SYNC fired the event.
70795:S 14 Nov 2020 13:40:45.552 * Master replied to PING, replication can continue...
70795:S 14 Nov 2020 13:40:45.552 * Trying a partial resynchronization (request de83e39b5538aec381a9b7cb415f0afd9e9fb69e:1).
70795:S 14 Nov 2020 13:40:45.553 * Full resync from master: 36c2011a9e000813ddbe35ea240fd347ad351a04:0
70795:S 14 Nov 2020 13:40:45.553 * Discarding previously cached master state.
70795:S 14 Nov 2020 13:40:45.634 * MASTER <-> REPLICA sync: receiving 175 bytes from master to disk
70795:S 14 Nov 2020 13:40:45.634 * MASTER <-> REPLICA sync: Flushing old data
70795:S 14 Nov 2020 13:40:45.659 * MASTER <-> REPLICA sync: Loading DB in memory
70795:S 14 Nov 2020 13:40:45.660 * Loading RDB produced by version 6.0.8
70795:S 14 Nov 2020 13:40:45.660 * RDB age 0 seconds
70795:S 14 Nov 2020 13:40:45.660 * RDB memory usage when created 2.46 Mb
70795:S 14 Nov 2020 13:40:45.660 * MASTER <-> REPLICA sync: Finished with success
70795:S 14 Nov 2020 13:40:45.661 * Background append only file rewriting started by pid 85384
70795:S 14 Nov 2020 13:40:45.700 * AOF rewrite child asks to stop sending diffs.
85384:C 14 Nov 2020 13:40:45.700 * Parent agreed to stop sending diffs. Finalizing AOF...
85384:C 14 Nov 2020 13:40:45.700 * Concatenating 0.00 MB of AOF diff received from parent.
85384:C 14 Nov 2020 13:40:45.700 * SYNC append only file rewrite performed
85384:C 14 Nov 2020 13:40:45.701 * AOF rewrite: 0 MB of memory used by copy-on-write
70795:S 14 Nov 2020 13:40:45.762 * Background AOF rewrite terminated with success
70795:S 14 Nov 2020 13:40:45.762 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
70795:S 14 Nov 2020 13:40:45.762 * Background AOF rewrite finished successfully
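Storing and reading data in the cluster
The startup output shows the hash slots being split across the masters, so next let's see how data is actually stored and read: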
[root@192 redis-6.0.8]# redis-cli -h 192.168.5.101
192.168.5.101:6379> set love guochengyu
(error) MOVED 16198 192.168.5.103:6379
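The key hashes to slot 16198, which belongs to the third master (192.168.5.103:6379), so the node we connected to did not store it and answered with a MOVED error instead. To have the client follow such redirections when writing and reading, just add -c to the redis-cli command: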
[root@192 redis-6.0.8]# redis-cli -h 192.168.5.101 -c
192.168.5.101:6379> set love guochenyubaobei
-> Redirected to slot [16198] located at 192.168.5.103:6379
OK
192.168.5.103:6379> get love
"guochenyubaobei"
192.168.5.103:6379>
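Why slot 16198? Redis Cluster maps a key to a slot by taking the CRC16 (XMODEM variant) of the key modulo 16384, and if the key contains a non-empty {...} hash tag only the part inside the braces is hashed. The sketch below is a plain-Python illustration of that rule, with function names chosen here; it reproduces the slot seen in the session for the key love.

```python
def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hash_slot(key: str) -> int:
    """Return the cluster slot (0..16383) a key maps to."""
    raw = key.encode()
    # Hash tag rule: hash only the text between the first "{" and the next "}",
    # provided that text is not empty.
    start = raw.find(b"{")
    if start != -1:
        end = raw.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            raw = raw[start + 1:end]
    return crc16_xmodem(raw) % 16384


print(hash_slot("love"))  # 16198, the slot reported by MOVED above
```

Since 16198 falls in 10923-16383, the key lives on 192.168.5.103:6379, which is where the client was redirected.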
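Master/replica switchover in the cluster
There are two cases to look at: a master going offline, and a replica going offline.
Losing a little slave is no big deal
When a replica goes offline, the master's log shows that it is marked as failed after being unreachable for roughly 10 seconds (a little over 12 seconds in the logs below, from 14:14:57 to 14:15:10). Once it comes back online, the master replicates data to it again. The other masters and replicas also track the state of the whole cluster:
Shut down the 102 slave that replicates the 101 master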
20189:signal-handler (1605334497) Received SIGINT scheduling shutdown...
20189:S 14 Nov 2020 14:14:57.915 # User requested shutdown...
20189:S 14 Nov 2020 14:14:57.915 * Calling fsync() on the AOF file.
20189:S 14 Nov 2020 14:14:57.915 * Saving the final RDB snapshot before exiting.
20189:S 14 Nov 2020 14:14:57.916 * DB saved on disk
20189:S 14 Nov 2020 14:14:57.916 # Redis is now ready to exit, bye bye...
[root@192 config]#
Its master on 101 loses its replica:
70362:M 14 Nov 2020 14:14:57.919 # Connection with replica 192.168.5.102:6380 lost.
70362:M 14 Nov 2020 14:15:10.105 * FAIL message received from 6c1005a89742e50db240775204c03ab3d7558e59 about 13ab2f0291cea595f51d6efac60c3e62278e64cb
The 101 master heard this FAIL message from the 103 master (node 6c1005a8...), so the 103 master is the node that first marked the replica as failing:
9496:M 14 Nov 2020 14:15:10.099 * Marking node 13ab2f0291cea595f51d6efac60c3e62278e64cb as failing (quorum reached).
The other nodes received the same notification.
Restart the 102 slave of the 101 master
When the 102 slave comes back online, every node is notified and clears the FAIL flag:
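To see how long the cluster took to declare the replica failed, the timestamps of the two log lines above can simply be subtracted. The sketch below does that with a small helper named log_time (a name picked here). Note that the two lines come from different machines, so the result is only as accurate as their clocks agree.

```python
from datetime import datetime


def log_time(line):
    """Extract the timestamp from a Redis log line such as
    '70362:M 14 Nov 2020 14:14:57.919 # ...' (English month names assumed)."""
    fields = line.split()
    stamp = " ".join(fields[1:5])  # day, month, year, time
    return datetime.strptime(stamp, "%d %b %Y %H:%M:%S.%f")


lost = log_time(
    "70362:M 14 Nov 2020 14:14:57.919 # Connection with replica 192.168.5.102:6380 lost."
)
failed = log_time(
    "9496:M 14 Nov 2020 14:15:10.099 * Marking node "
    "13ab2f0291cea595f51d6efac60c3e62278e64cb as failing (quorum reached)."
)
elapsed = (failed - lost).total_seconds()
print(f"replica marked as failing after {elapsed:.2f} s")
```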
9496:M 14 Nov 2020 14:23:29.425 * Clear FAIL state for node 13ab2f0291cea595f51d6efac60c3e62278e64cb: replica is reachable again.
Meanwhile, the slave's master on 101 also has to take care of the master-replica synchronization:
70362:M 14 Nov 2020 14:23:29.463 * Clear FAIL state for node 13ab2f0291cea595f51d6efac60c3e62278e64cb: replica is reachable again.
70362:M 14 Nov 2020 14:23:30.381 * Replica 192.168.5.102:6380 asks for synchronization
70362:M 14 Nov 2020 14:23:30.381 * Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'eeea1b2c358998aacc3dcf525484245183fcb9ab', my replication IDs are '64bd8964903df1d43df728eed12e7c00956674f3' and '0000000000000000000000000000000000000000')
70362:M 14 Nov 2020 14:23:30.381 * Starting BGSAVE for SYNC with target: disk
70362:M 14 Nov 2020 14:23:30.382 * Background saving started by pid 86035
86035:C 14 Nov 2020 14:23:30.386 * DB saved on disk
86035:C 14 Nov 2020 14:23:30.388 * RDB: 0 MB of memory used by copy-on-write
70362:M 14 Nov 2020 14:23:30.470 * Background saving terminated with success
70362:M 14 Nov 2020 14:23:30.470 * Synchronization with replica 192.168.5.102:6380 succeeded
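The "Partial resynchronization not accepted: Replication ID mismatch" line explains why a full resync follows: the replica asked to continue from a replication ID the master does not recognise. The function below is a simplified sketch of that decision, not Redis's actual code. Its name and parameters are made up here, and the backlog bounds passed in the example are hypothetical because the log does not show them.

```python
def can_partial_resync(requested_id, requested_offset,
                       replid, replid2, second_replid_offset,
                       backlog_start, backlog_end):
    """Return True if a replica's request could be served from the backlog.

    Simplified model: the requested ID must match the master's current ID,
    or its previous ID up to the offset where that history ended, and the
    requested offset must still be inside the replication backlog.
    """
    id_matches = requested_id == replid or (
        requested_id == replid2 and requested_offset <= second_replid_offset
    )
    if not id_matches:
        return False  # "Replication ID mismatch" -> full resync
    return backlog_start <= requested_offset <= backlog_end


# IDs and requested offset taken from the logs of this restart;
# second_replid_offset and the backlog bounds are hypothetical.
accepted = can_partial_resync(
    requested_id="eeea1b2c358998aacc3dcf525484245183fcb9ab",
    requested_offset=1,
    replid="64bd8964903df1d43df728eed12e7c00956674f3",
    replid2="0000000000000000000000000000000000000000",
    second_replid_offset=-1,
    backlog_start=1,
    backlog_end=2856,
)
print("partial resync" if accepted else "full resync")
```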
And the 102 slave naturally has to sync its data from the master:
21392:M 14 Nov 2020 14:23:29.366 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
21392:M 14 Nov 2020 14:23:29.366 # Server initialized
21392:M 14 Nov 2020 14:23:29.366 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
21392:M 14 Nov 2020 14:23:29.366 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
21392:M 14 Nov 2020 14:23:29.367 * Reading RDB preamble from AOF file...
21392:M 14 Nov 2020 14:23:29.367 * Loading RDB produced by version 6.0.8
21392:M 14 Nov 2020 14:23:29.367 * RDB age 2563 seconds
21392:M 14 Nov 2020 14:23:29.367 * RDB memory usage when created 2.43 Mb
21392:M 14 Nov 2020 14:23:29.367 * RDB has an AOF tail
21392:M 14 Nov 2020 14:23:29.367 * Reading the remaining AOF tail...
21392:M 14 Nov 2020 14:23:29.367 * DB loaded from append only file: 0.000 seconds
21392:M 14 Nov 2020 14:23:29.367 * Ready to accept connections
21392:S 14 Nov 2020 14:23:29.367 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
21392:S 14 Nov 2020 14:23:29.367 # Cluster state changed: ok
21392:S 14 Nov 2020 14:23:30.374 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:23:30.374 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:23:30.375 * Non blocking connect for SYNC fired the event.
21392:S 14 Nov 2020 14:23:30.377 * Master replied to PING, replication can continue...
21392:S 14 Nov 2020 14:23:30.379 * Trying a partial resynchronization (request eeea1b2c358998aacc3dcf525484245183fcb9ab:1).
21392:S 14 Nov 2020 14:23:30.381 * Full resync from master: 64bd8964903df1d43df728eed12e7c00956674f3:2856
21392:S 14 Nov 2020 14:23:30.381 * Discarding previously cached master state.
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: receiving 176 bytes from master to disk
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: Flushing old data
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: Loading DB in memory
21392:S 14 Nov 2020 14:23:30.469 * Loading RDB produced by version 6.0.8
21392:S 14 Nov 2020 14:23:30.469 * RDB age 0 seconds
21392:S 14 Nov 2020 14:23:30.469 * RDB memory usage when created 2.45 Mb
21392:S 14 Nov 2020 14:23:30.469 * MASTER <-> REPLICA sync: Finished with success
21392:S 14 Nov 2020 14:23:30.469 * Background append only file rewriting started by pid 21396
21392:S 14 Nov 2020 14:23:30.519 * AOF rewrite child asks to stop sending diffs.
21396:C 14 Nov 2020 14:23:30.520 * Parent agreed to stop sending diffs. Finalizing AOF...
21396:C 14 Nov 2020 14:23:30.520 * Concatenating 0.00 MB of AOF diff received from parent.
21396:C 14 Nov 2020 14:23:30.520 * SYNC append only file rewrite performed
21396:C 14 Nov 2020 14:23:30.520 * AOF rewrite: 0 MB of memory used by copy-on-write
21392:S 14 Nov 2020 14:23:30.576 * Background AOF rewrite terminated with success
21392:S 14 Nov 2020 14:23:30.576 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
21392:S 14 Nov 2020 14:23:30.576 * Background AOF rewrite finished successfully
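A change of dynasty: the master becomes a slave
Now for the drama of the year, a change of dynasty: the old master loses contact for a while, its slave usurps the throne, and when the old master returns it has to settle for being a humble slave. When a master goes offline, its replica retries the connection about once per second. After roughly 10 seconds (the log below shows 13 failed attempts over about 13 seconds) the master is declared failed, the replica stands for election and becomes the new master, and when the old master reconnects it becomes a slave of the new one. The other masters and replicas track the state of the whole cluster.
Shut down the 101 master
After the shutdown, its replica (the 102 slave) keeps trying to connect to the configured master but cannot reach it: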
21392:S 14 Nov 2020 14:30:21.427 # Connection with master lost.
21392:S 14 Nov 2020 14:30:21.427 * Caching the disconnected master state.
21392:S 14 Nov 2020 14:30:21.500 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:21.500 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:21.500 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:22.508 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:22.509 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:22.509 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:23.517 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:23.517 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:23.517 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:24.525 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:24.525 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:24.526 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:25.530 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:25.530 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:25.530 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:26.538 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:26.538 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:26.538 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:27.546 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:27.546 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:27.546 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:28.556 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:28.556 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:28.556 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:29.566 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:29.566 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:29.567 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:30.574 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:30.574 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:30.575 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:31.580 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:31.580 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:31.580 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:32.588 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:32.588 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:32.588 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:33.596 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:33.596 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:33.596 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:34.246 * FAIL message received from 6c1005a89742e50db240775204c03ab3d7558e59 about 2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b
21392:S 14 Nov 2020 14:30:34.246 # Cluster state changed: fail
21392:S 14 Nov 2020 14:30:34.304 # Start of election delayed for 792 milliseconds (rank #0, offset 3430).
21392:S 14 Nov 2020 14:30:34.605 * Connecting to MASTER 192.168.5.101:6379
21392:S 14 Nov 2020 14:30:34.605 * MASTER <-> REPLICA sync started
21392:S 14 Nov 2020 14:30:34.606 # Error condition on socket for SYNC: Operation now in progress
21392:S 14 Nov 2020 14:30:35.108 # Starting a failover election for epoch 7.
21392:S 14 Nov 2020 14:30:35.110 # Failover election won: I'm the new master.
21392:S 14 Nov 2020 14:30:35.110 # configEpoch set to 7 after successful failover
21392:M 14 Nov 2020 14:30:35.110 * Discarding previously cached master state.
21392:M 14 Nov 2020 14:30:35.110 # Setting secondary replication ID to 64bd8964903df1d43df728eed12e7c00956674f3, valid up to offset: 3431. New replication ID is 3088efe0ca743a931b3863cbb0fd673811c31b7a
21392:M 14 Nov 2020 14:30:35.110 # Cluster state changed: ok
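The "Start of election delayed for 792 milliseconds (rank #0, offset 3430)" line follows the delay rule given in the Redis Cluster specification: a fixed 500 ms, plus a random 0 to 500 ms, plus 1000 ms per rank, where rank 0 is the replica with the most up-to-date replication offset. The sketch below only illustrates that formula; the function name and the way the jitter is passed in are choices made here.

```python
import random


def election_delay_ms(rank, jitter_ms=None):
    """Delay before a replica starts a failover election (illustrative).

    delay = 500 ms fixed + random 0..500 ms + rank * 1000 ms
    """
    if jitter_ms is None:
        jitter_ms = random.randint(0, 500)
    return 500 + jitter_ms + rank * 1000


# The log shows 792 ms at rank 0, which corresponds to a random part of 292 ms.
print(election_delay_ms(rank=0, jitter_ms=292))
# A second, less up-to-date replica (rank 1) would wait one extra second.
print(election_delay_ms(rank=1, jitter_ms=292))
```

Here the slave was the only replica of the failed master, so it had rank 0 and started the election for epoch 7 about 0.8 seconds after the delay was announced.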
So it usurps the throne and promotes itself to master, and the other nodes are notified as well. Checking the cluster information gives:
[root@192 redis-6.0.8]# redis-cli -h 192.168.5.102 -c
192.168.5.102:6379> cluster nodes
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 master - 0 1605335617797 7 connected 0-5460
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605335615777 2 connected
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 master,fail - 1605335422306 1605335419000 1 disconnected
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605335614000 3 connected 10923-16383
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605335616788 3 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 myself,master - 0 1605335615000 2 connected 5461-10922
192.168.5.102:6379>
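We can see one failed master (192.168.5.101:6379, flagged master,fail) and one master that usurped it successfully (192.168.5.102:6380, which now owns slots 0-5460).
Restart the 101 master, oh wait, it is now the 101 slave, a servant of the 102 master
It has been restarted, but it can only exist as a slave now, so after starting it has to sync its data from the new master: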
86216:M 14 Nov 2020 14:35:39.278 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
86216:M 14 Nov 2020 14:35:39.278 # Server initialized
86216:M 14 Nov 2020 14:35:39.278 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
86216:M 14 Nov 2020 14:35:39.278 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo madvise > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled (set to 'madvise' or 'never').
86216:M 14 Nov 2020 14:35:39.278 * Ready to accept connections
86216:M 14 Nov 2020 14:35:39.279 # Configuration change detected. Reconfiguring myself as a replica of 13ab2f0291cea595f51d6efac60c3e62278e64cb
86216:S 14 Nov 2020 14:35:39.279 * Before turning into a replica, using my own master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
86216:S 14 Nov 2020 14:35:39.279 # Cluster state changed: ok
86216:S 14 Nov 2020 14:35:40.285 * Connecting to MASTER 192.168.5.102:6380
86216:S 14 Nov 2020 14:35:40.285 * MASTER <-> REPLICA sync started
86216:S 14 Nov 2020 14:35:40.285 * Non blocking connect for SYNC fired the event.
86216:S 14 Nov 2020 14:35:40.286 * Master replied to PING, replication can continue...
86216:S 14 Nov 2020 14:35:40.287 * Trying a partial resynchronization (request 647cca874c4c83a50d4fe5f82690eb51df36aa6d:1).
86216:S 14 Nov 2020 14:35:40.287 * Full resync from master: 3088efe0ca743a931b3863cbb0fd673811c31b7a:3430
86216:S 14 Nov 2020 14:35:40.287 * Discarding previously cached master state.
86216:S 14 Nov 2020 14:35:40.307 * MASTER <-> REPLICA sync: receiving 176 bytes from master to disk
86216:S 14 Nov 2020 14:35:40.307 * MASTER <-> REPLICA sync: Flushing old data
86216:S 14 Nov 2020 14:35:40.309 * MASTER <-> REPLICA sync: Loading DB in memory
86216:S 14 Nov 2020 14:35:40.309 * Loading RDB produced by version 6.0.8
86216:S 14 Nov 2020 14:35:40.309 * RDB age 0 seconds
86216:S 14 Nov 2020 14:35:40.309 * RDB memory usage when created 2.45 Mb
86216:S 14 Nov 2020 14:35:40.309 * MASTER <-> REPLICA sync: Finished with success
86216:S 14 Nov 2020 14:35:40.309 * Background append only file rewriting started by pid 86220
86216:S 14 Nov 2020 14:35:40.349 * AOF rewrite child asks to stop sending diffs.
86220:C 14 Nov 2020 14:35:40.349 * Parent agreed to stop sending diffs. Finalizing AOF...
86220:C 14 Nov 2020 14:35:40.349 * Concatenating 0.00 MB of AOF diff received from parent.
86220:C 14 Nov 2020 14:35:40.350 * SYNC append only file rewrite performed
86220:C 14 Nov 2020 14:35:40.350 * AOF rewrite: 0 MB of memory used by copy-on-write
86216:S 14 Nov 2020 14:35:40.387 * Background AOF rewrite terminated with success
86216:S 14 Nov 2020 14:35:40.387 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
86216:S 14 Nov 2020 14:35:40.387 * Background AOF rewrite finished successfully
Check the cluster state again:
192.168.5.102:6379> cluster nodes
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 master - 0 1605335837000 7 connected 0-5460
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605335837414 2 connected
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 slave 13ab2f0291cea595f51d6efac60c3e62278e64cb 0 1605335837000 7 connected
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605335837000 3 connected 10923-16383
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605335838422 3 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 myself,master - 0 1605335836000 2 connected 5461-10922
192.168.5.102:6379>
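Reading who replicates whom out of this output by eye is tedious, so here is a small sketch that groups the replicas under their masters. The function name replication_map is made up for illustration, and the input is the six node lines from the output above, copied verbatim.

```python
CLUSTER_NODES = """
13ab2f0291cea595f51d6efac60c3e62278e64cb 192.168.5.102:6380@16380 master - 0 1605335837000 7 connected 0-5460
46be63362f47ece342c54f4e042ef09d3ca0ec1b 192.168.5.103:6380@16380 slave 94d9e1637bd1701b146e367ffa7a69e8c24566e8 0 1605335837414 2 connected
2eab309dd5f41f317bd1c2b0c8616aee7e4ac05b 192.168.5.101:6379@16379 slave 13ab2f0291cea595f51d6efac60c3e62278e64cb 0 1605335837000 7 connected
6c1005a89742e50db240775204c03ab3d7558e59 192.168.5.103:6379@16379 master - 0 1605335837000 3 connected 10923-16383
bea352a33ee43ca0f6e5238a8170a01820af7f93 192.168.5.101:6380@16380 slave 6c1005a89742e50db240775204c03ab3d7558e59 0 1605335838422 3 connected
94d9e1637bd1701b146e367ffa7a69e8c24566e8 192.168.5.102:6379@16379 myself,master - 0 1605335836000 2 connected 5461-10922
"""


def replication_map(cluster_nodes_output):
    """Map each master's host:port to the sorted host:port list of its replicas."""
    addr_by_id = {}
    replicas_by_master = {}
    for line in cluster_nodes_output.strip().splitlines():
        parts = line.split()
        if len(parts) < 8:
            continue  # not a node line
        node_id, addr, flags, master_id = parts[:4]
        addr_by_id[node_id] = addr.split("@")[0]  # drop the cluster bus port
        if "master" in flags.split(","):
            replicas_by_master.setdefault(node_id, [])
        else:
            replicas_by_master.setdefault(master_id, []).append(node_id)
    return {
        addr_by_id.get(master, master): sorted(addr_by_id[r] for r in replicas)
        for master, replicas in replicas_by_master.items()
    }


topology = replication_map(CLUSTER_NODES)
for master, replicas in topology.items():
    print(master, "<-", replicas)
```

The result confirms the change of dynasty: 192.168.5.102:6380 is now the master and 192.168.5.101:6379 is its replica.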