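Redis is an in-memory database: its data lives in memory. To survive restarts, Redis provides two persistence mechanisms, RDB (Redis DataBase) and AOF (Append Only File).
Persistence flow
Saving Redis data to disk roughly goes through the following steps:
- The client sends a write operation to the server (the data is in the client's memory)
- The database server receives the data of the write request (the data is in the server's memory)
- The server calls the write system call to write the data to disk (the data is in the operating system's buffer in memory)
- The operating system transfers the buffered data to the disk controller (the data is in the disk cache)
- The disk controller writes the data to the physical medium (the data is now truly on disk)
Two kinds of failure:
- The Redis process fails: as long as step 3 above has completed, the data will be persisted, because the operating system carries out the remaining two steps for us.
- The operating system fails (or the machine loses power): the data is safe only after all 5 steps have completed.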
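The difference between step 3 and steps 4 to 5 can be made concrete with a small Python sketch. This is not Redis code: the function name durable_append and the file name demo.log are made up for illustration. It only shows that write hands the data to the operating system's buffer, while fsync asks the operating system to push that buffer to the device.

```python
import os
import tempfile


def durable_append(path: str, payload: bytes) -> int:
    """Append payload to path and ask the OS to flush it to the device.

    Returns the number of bytes written. Hypothetical helper, not part of Redis.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        # Step 3: write(2) copies the data into the kernel's buffer.
        # At this point it survives a crash of this process, but not a power loss.
        written = os.write(fd, payload)
        # Steps 4-5: fsync(2) asks the OS to transfer the buffered data to the
        # disk. How strictly this is honoured depends on the OS and the hardware.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "demo.log")
        print(durable_append(log, b"SET k v\n"))  # prints 8
```

Skipping the os.fsync call corresponds to stopping after step 3: fast, and enough to survive a crash of the process, but not a crash of the operating system.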
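How does Redis carry out these 5 steps? It offers two strategies: RDB and AOF.
The RDB mechanism
RDB saves the data to disk in the form of a snapshot.
Snapshot: a copy of the current data and its state at a point in time.
Anyone who has used virtual machine snapshots will recognize the idea right away.
RDB persistence writes a snapshot of the in-memory dataset to disk at configured intervals. It is Redis's default persistence method.
The snapshot is written to a binary file, named dump.rdb by default.
RDB can be triggered in three ways: save, bgsave, and automatic triggering.
RDB configuration in redis.conf
The RDB settings live in the SNAPSHOTTING section of redis.conf: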
################################ SNAPSHOTTING ################################
# Save the DB to disk.
#
# save <seconds> <changes> [<seconds> <changes> ...]
#
# Redis will save the DB if the given number of seconds elapsed and it
# surpassed the given number of write operations against the DB.
#
# Snapshotting can be completely disabled with a single empty string argument
# as in following example:
#
# save ""
#
# Unless specified otherwise, by default Redis will save the DB:
# * After 3600 seconds (an hour) if at least 1 change was performed
# * After 300 seconds (5 minutes) if at least 100 changes were performed
# * After 60 seconds if at least 10000 changes were performed
#
# You can set these explicitly by uncommenting the following line.
#
# save 3600 1 300 100 60 10000
# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
stop-writes-on-bgsave-error yes
# Compress string objects using LZF when dump .rdb databases?
# By default compression is enabled as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes
# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
rdbchecksum yes
# Enables or disables full sanitization checks for ziplist and listpack etc when
# loading an RDB or RESTORE payload. This reduces the chances of a assertion or
# crash later on while processing commands.
# Options:
# no - Never perform full sanitization
# yes - Always perform full sanitization
# clients - Perform full sanitization only for user connections.
# Excludes: RDB files, RESTORE commands received from the master
# connection, and client connections which have the
# skip-sanitize-payload ACL flag.
# The default should be 'clients' but since it currently affects cluster
# resharding via MIGRATE, it is temporarily set to 'no' by default.
#
# sanitize-dump-payload no
# The filename where to dump the DB
dbfilename dump.rdb
# Remove RDB files used by replication in instances without persistence
# enabled. By default this option is disabled, however there are environments
# where for regulations or other security concerns, RDB files persisted on
# disk by masters in order to feed replicas, or stored on disk by replicas
# in order to load them for the initial synchronization, should be deleted
# ASAP. Note that this option ONLY WORKS in instances that have both AOF
# and RDB persistence disabled, otherwise is completely ignored.
#
# An alternative (and sometimes better) way to obtain the same effect is
# to use diskless replication on both master and replicas instances. However
# in the case of replicas, diskless is not always an option.
rdb-del-sync-files no
# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
dir ./
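- stop-writes-on-bgsave-error: whether Redis stops accepting writes when the latest background save has failed. The default is yes.
- rdbcompression: whether to compress the snapshot stored on disk (LZF compression of string objects). The default is yes.
- rdbchecksum: whether to place a CRC64 checksum at the end of the snapshot file and verify it on load. The default is yes. The check costs roughly 10% in performance when saving and loading.
- save <seconds> <changes>: tells Redis to take a snapshot once the given number of seconds has elapsed and at least the given number of write operations has occurred. For example, save 60 3 takes a snapshot when 60 seconds have passed and at least 3 changes have been made.
- dbfilename dump.rdb: the name of the snapshot file. The default is dump.rdb.
- rdb-del-sync-files no: whether to delete the RDB files used for replication. When set to yes on an instance with both RDB and AOF persistence disabled, these files are deleted as soon as possible.
- dir ./: the directory where Redis stores its persistence files (the RDB file and the AOF files).
- save "": this setting disables snapshotting completely.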
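Three ways to trigger RDB: save, bgsave, and automatic triggering
The save command
save: blocks the Redis server. While save is running, Redis cannot process any other command until the RDB process has finished.
The bgsave command
bgsave: Redis takes the snapshot in the background and can still respond to client requests while it does so.
- The Redis process calls fork to create a child process. The child carries out the RDB persistence and exits when it is done. Blocking occurs only during the fork itself, which is usually very short. Almost all RDB operations inside Redis use bgsave.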
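The fork-based approach of bgsave can be illustrated with a short Python sketch. It assumes a Unix-like system where os.fork is available. The function bgsave_sketch, the JSON file format, and the file name dump.json are stand-ins chosen for illustration and have nothing to do with the real RDB format. The point is only that the child sees the data as it was at fork time, while the parent keeps accepting changes.

```python
import json
import os
import tempfile


def bgsave_sketch(data: dict, path: str, mutate) -> dict:
    """Fork a child that dumps `data` to `path` while the parent keeps mutating.

    Returns what the child wrote. Hypothetical helper, not Redis code.
    """
    pid = os.fork()
    if pid == 0:
        # Child: works on its own copy of the memory as of fork() time.
        try:
            with open(path, "w") as f:
                json.dump(data, f)
                f.flush()
                os.fsync(f.fileno())
        finally:
            os._exit(0)  # leave immediately, never run the parent's code path
    # Parent: keeps serving writes while the child is saving.
    mutate(data)
    os.waitpid(pid, 0)
    with open(path) as f:
        return json.load(f)


if __name__ == "__main__":
    db = {"k1": "v1"}
    target = os.path.join(tempfile.mkdtemp(), "dump.json")
    snapshot = bgsave_sketch(db, target, lambda d: d.update(k2="v2"))
    print(snapshot)  # {'k1': 'v1'}
    print(db)        # {'k1': 'v1', 'k2': 'v2'}
```

The key k2 is written after the fork, so it is present in the live dataset but absent from the snapshot.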
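Automatic triggering
Automatic triggering is driven by redis.conf: the save <seconds> <changes> directive makes Redis run bgsave automatically.
For example: save 60 5
Redis triggers an RDB snapshot once 60 seconds have passed since the last save and at least 5 changes to keys have been made in that time.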
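The save-point rule can be written down as a small predicate. The following Python sketch is a simplified model under stated assumptions: the function name should_snapshot and its parameters are invented here, and the real server performs this check periodically and applies further conditions (for example after a failed background save).

```python
from typing import List, Tuple


def should_snapshot(save_points: List[Tuple[int, int]],
                    seconds_since_last_save: float,
                    changes_since_last_save: int) -> bool:
    """Return True if any (seconds, changes) save point is satisfied.

    Simplified model of `save <seconds> <changes>`, not Redis source code.
    """
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


if __name__ == "__main__":
    demo = [(60, 5)]  # corresponds to: save 60 5
    print(should_snapshot(demo, 30, 5))   # False: only 30 seconds have passed
    print(should_snapshot(demo, 60, 4))   # False: only 4 changes
    print(should_snapshot(demo, 60, 5))   # True

    defaults = [(3600, 1), (300, 100), (60, 10000)]  # save 3600 1 300 100 60 10000
    print(should_snapshot(defaults, 400, 150))  # True, via the (300, 100) point
```

Note that both conditions must hold: 5 changes made within the first 10 seconds do not cause a snapshot until the 60 seconds are up.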
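Let's practice. First, edit redis.conf:
vim /myredis/redis.conf # tip: use the path of your own redis.conf
# The changes are shown below. We only set the automatic trigger, the directory for the .rdb file, and the logfile (to make things easy to observe)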
################################ SNAPSHOTTING ################################
save 60 5
dir /myredis/
################################# GENERAL #####################################
logfile "/myredis/redis.log"
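After changing the configuration, stop the server with the shutdown command in redis-cli, restart redis-server, and test again from redis-cli (changing key values 5 times is enough).
That completes the practice of RDB automatic triggering 😀
Tip: remember to restore the configuration you changed. It was only modified to demonstrate RDB automatic triggering.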
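Advantages and disadvantages of RDB
① Advantages
(1) An RDB file is compact and is a full backup, which makes it well suited to backups and disaster recovery.
(2) When generating the RDB file, the Redis main process calls fork() to create a child process that does all the saving work. The main process performs no disk I/O.
(3) RDB restores large datasets faster than AOF.
② Disadvantages
An RDB snapshot is a full backup that stores the in-memory data in a compact, binary serialized form. During a snapshot, a child process is started solely to write it. The child sees the parent's memory as it was at the time of the fork, so changes the parent makes afterwards are not reflected in it. Data modified during or after the snapshot is therefore not saved until the next snapshot, and is lost if Redis goes down before then.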
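AOF
The way AOF works is simple: Redis appends every write command it receives to a file through the write function. Think of it as a log.
How AOF works
Every time a write command arrives, it is appended to the AOF file.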
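Commands in the AOF are stored in the Redis protocol format (RESP): an array header, then each argument as a length-prefixed string. The Python sketch below shows what appending one write command looks like. The helpers encode_command and append_command and the file name demo.aof are hypothetical, and a real AOF contains more than this (for example SELECT commands that choose the database).

```python
import os
import tempfile
from typing import List


def encode_command(args: List[str]) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        raw = arg.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(out)


def append_command(path: str, args: List[str]) -> None:
    """Append one encoded command to the end of the log file."""
    with open(path, "ab") as f:
        f.write(encode_command(args))


if __name__ == "__main__":
    print(encode_command(["SET", "k1", "v1"]))
    # b'*3\r\n$3\r\nSET\r\n$2\r\nk1\r\n$2\r\nv1\r\n'

    log = os.path.join(tempfile.mkdtemp(), "demo.aof")
    append_command(log, ["SET", "k1", "v1"])
    append_command(log, ["DEL", "k1"])
    with open(log, "rb") as f:
        print(f.read())
```

This is also why an incremental AOF file can be read with cat: apart from the line endings, it is plain text.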
AOF configuration
The AOF settings live in the APPEND ONLY MODE section of redis.conf:
############################## APPEND ONLY MODE ###############################
# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check https://redis.io/topics/persistence for more information.
appendonly no
# The base name of the append only file.
#
# Redis 7 and newer use a set of append-only files to persist the dataset
# and changes applied to it. There are two basic types of files in use:
#
# - Base files, which are a snapshot representing the complete state of the
# dataset at the time the file was created. Base files can be either in
# the form of RDB (binary serialized) or AOF (textual commands).
# - Incremental files, which contain additional commands that were applied
# to the dataset following the previous file.
#
# In addition, manifest files are used to track the files and the order in
# which they were created and should be applied.
#
# Append-only file names are created by Redis following a specific pattern.
# The file name's prefix is based on the 'appendfilename' configuration
# parameter, followed by additional information about the sequence and type.
#
# For example, if appendfilename is set to appendonly.aof, the following file
# names could be derived:
#
# - appendonly.aof.1.base.rdb as a base file.
# - appendonly.aof.1.incr.aof, appendonly.aof.2.incr.aof as incremental files.
# - appendonly.aof.manifest as a manifest file.
appendfilename "appendonly.aof"
# For convenience, Redis stores all persistent append-only files in a dedicated
# directory. The name of the directory is determined by the appenddirname
# configuration parameter.
appenddirname "appendonlydir"
# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
# everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".
# appendfsync always
appendfsync everysec
# appendfsync no
# When the AOF fsync policy is set to always or everysec, and a background
# saving process (a background save or AOF log background rewriting) is
# performing a lot of I/O against the disk, in some Linux configurations
# Redis may block too long on the fsync() call. Note that there is no fix for
# this currently, as even performing fsync in a different thread will block
# our synchronous write(2) call.
#
# In order to mitigate this problem it's possible to use the following option
# that will prevent fsync() from being called in the main process while a
# BGSAVE or BGREWRITEAOF is in progress.
#
# This means that while another child is saving, the durability of Redis is
# the same as "appendfsync no". In practical terms, this means that it is
# possible to lose up to 30 seconds of log in the worst scenario (with the
# default Linux settings).
#
# If you have latency problems turn this to "yes". Otherwise leave it as
# "no" that is the safest pick from the point of view of durability.
no-appendfsync-on-rewrite no
# Automatic rewrite of the append only file.
# Redis is able to automatically rewrite the log file implicitly calling
# BGREWRITEAOF when the AOF log size grows by the specified percentage.
#
# This is how it works: Redis remembers the size of the AOF file after the
# latest rewrite (if no rewrite has happened since the restart, the size of
# the AOF at startup is used).
#
# This base size is compared to the current size. If the current size is
# bigger than the specified percentage, the rewrite is triggered. Also
# you need to specify a minimal size for the AOF file to be rewritten, this
# is useful to avoid rewriting the AOF file even if the percentage increase
# is reached but it is still pretty small.
#
# Specify a percentage of zero in order to disable the automatic AOF
# rewrite feature.
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
# An AOF file may be found to be truncated at the end during the Redis
# startup process, when the AOF data gets loaded back into memory.
# This may happen when the system where Redis is running
# crashes, especially when an ext4 filesystem is mounted without the
# data=ordered option (however this can't happen when Redis itself
# crashes or aborts but the operating system still works correctly).
#
# Redis can either exit with an error when this happens, or load as much
# data as possible (the default now) and start if the AOF file is found
# to be truncated at the end. The following option controls this behavior.
#
# If aof-load-truncated is set to yes, a truncated AOF file is loaded and
# the Redis server starts emitting a log to inform the user of the event.
# Otherwise if the option is set to no, the server aborts with an error
# and refuses to start. When the option is set to no, the user requires
# to fix the AOF file using the "redis-check-aof" utility before to restart
# the server.
#
# Note that if the AOF file will be found to be corrupted in the middle
# the server will still exit with an error. This option only applies when
# Redis will try to read more data from the AOF file but not enough bytes
# will be found.
aof-load-truncated yes
# Redis can create append-only base files in either RDB or AOF formats. Using
# the RDB format is always faster and more efficient, and disabling it is only
# supported for backward compatibility purposes.
aof-use-rdb-preamble yes
# Redis supports recording timestamp annotations in the AOF to support restoring
# the data from a specific point-in-time. However, using this capability changes
# the AOF format in a way that may not be compatible with existing AOF parsers.
aof-timestamp-enabled no
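In the Redis configuration file, the APPEND ONLY MODE section is an important part of the persistence setup. AOF (Append Only File) persistence works by saving every write command the Redis server receives to a file. When Redis restarts, it rebuilds the dataset by re-executing those commands.
The configuration items in APPEND ONLY MODE and their meaning: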
appendonly:
- Turns AOF persistence on or off.
- yes: AOF persistence is enabled.
- no: AOF persistence is disabled.
appendfilename:
- Sets the name of the AOF file.
- The default is appendonly.aof. In Redis 7 and newer this is the base name from which the names of the base, incremental, and manifest files are derived.
appenddirname:
- Sets the directory in which the AOF files are stored.
- The default is appendonlydir.
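appendfsync:
- Controls when the AOF file is synced to disk.
- always: fsync after every write command. The slowest but safest choice.
- everysec: fsync once per second. This is the default and recommended setting, a good balance between performance and durability.
- no: let the operating system decide when to flush. Faster, but more data can be lost if the system crashes.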
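no-appendfsync-on-rewrite:
- Whether to skip fsync in the main process while a BGSAVE or BGREWRITEAOF is in progress.
- yes: no fsync is performed during that time. This avoids latency problems but increases the risk of data loss, because durability is then the same as with appendfsync no.
- no: fsync is still performed during a rewrite. (default)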
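auto-aof-rewrite-percentage:
- The growth percentage that triggers an AOF rewrite.
- A rewrite is triggered when the AOF file has grown by the given percentage relative to its size after the last rewrite.
- The default is 100, meaning a rewrite is triggered once the AOF file is twice the size it had after the last rewrite.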
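auto-aof-rewrite-min-size:
- The minimum file size for an AOF rewrite.
- Even when the AOF file has grown by the percentage given in auto-aof-rewrite-percentage, no rewrite is triggered while the file is smaller than this value.
- This prevents unnecessary rewrites of very small AOF files.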
aof-load-truncated:
- Whether Redis should load an AOF file that is found to be truncated at the end.
- yes: Redis loads as much data as possible, starts, and logs the event. (default)
- no: Redis exits with an error and refuses to start.
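aof-use-rdb-preamble:
- Determines the format of the AOF file.
- yes: the AOF starts with a snapshot of the data in RDB format, followed by commands in AOF format. This speeds up loading. (default)
- no: the AOF contains only commands in AOF format.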
aof-timestamp-enabled:
- Whether to record timestamp annotations in the AOF. The default is no.
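Together these options give fine-grained control over AOF persistence, so it can be tuned to the application's needs and the hardware's performance. Make sure you understand what each option means and what it can affect before changing it, and test properly afterwards.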
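The interplay of auto-aof-rewrite-percentage and auto-aof-rewrite-min-size can be sketched as one function. This is a simplified Python model, not the server's code: the name should_rewrite and its parameters are invented for illustration, with defaults mirroring the values 100 and 64mb from the configuration above.

```python
MB = 1024 * 1024


def should_rewrite(current_size: int, base_size: int,
                   percentage: int = 100, min_size: int = 64 * MB) -> bool:
    """Decide whether an automatic AOF rewrite is due.

    current_size: size of the AOF now, in bytes.
    base_size: size of the AOF after the last rewrite (or at startup).
    """
    if percentage == 0:
        # A percentage of zero disables the automatic rewrite feature.
        return False
    if current_size < min_size:
        # Too small to be worth rewriting, whatever the growth.
        return False
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100
    return growth >= percentage


if __name__ == "__main__":
    print(should_rewrite(128 * MB, 64 * MB))  # True: grew by 100%
    print(should_rewrite(100 * MB, 64 * MB))  # False: grew by about 56%
    print(should_rewrite(10 * MB, 1 * MB))    # False: below the 64 MB minimum
```

The minimum size acts as a guard: a 1 MB file that grows to 10 MB has grown by 900%, yet rewriting it would save very little.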
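How the file rewrite works
AOF brings one problem: the persistence file keeps growing. To compact it, Redis provides the bgrewriteaof command. Redis forks a child process, which writes the in-memory data as a series of commands to a temporary file.
The rewrite does not read the old AOF file. It writes out the whole in-memory database as commands into a new AOF file, which is somewhat similar to a snapshot.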
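Why a rewrite shrinks the file can be seen in a toy model. The Python sketch below supports only three commands (SET, DEL, INCR) on string values. The functions replay and rewrite are hypothetical and vastly simpler than Redis itself. What they show is that the new log is generated from the current in-memory state, not from the old log.

```python
from typing import Dict, List


def replay(commands: List[List[str]]) -> Dict[str, str]:
    """Rebuild the dataset by re-executing a command log (toy subset)."""
    db: Dict[str, str] = {}
    for cmd in commands:
        name = cmd[0].upper()
        if name == "SET":
            db[cmd[1]] = cmd[2]
        elif name == "DEL":
            for key in cmd[1:]:
                db.pop(key, None)
        elif name == "INCR":
            db[cmd[1]] = str(int(db.get(cmd[1], "0")) + 1)
    return db


def rewrite(db: Dict[str, str]) -> List[List[str]]:
    """Produce a minimal log from the in-memory state: one SET per key."""
    return [["SET", key, value] for key, value in db.items()]


if __name__ == "__main__":
    history = [
        ["SET", "counter", "0"],
        ["INCR", "counter"],
        ["INCR", "counter"],
        ["INCR", "counter"],
        ["SET", "tmp", "x"],
        ["DEL", "tmp"],
    ]
    state = replay(history)
    compact = rewrite(state)
    print(len(history), len(compact))  # 6 1
    print(compact)                     # [['SET', 'counter', '3']]
```

Replaying the compact log yields the same dataset as replaying the full history, which is the property a rewrite must preserve.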
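The three AOF sync policies: appendfsync
(1) always, sync on every change: synchronous persistence. Every data change is written to disk immediately. Performance is worse, but data integrity is better. appendfsync always
(2) everysec, sync every second (the default): asynchronous, fsync once per second. If the machine goes down within that second, the data of that second is lost. appendfsync everysec
(3) no, never sync explicitly: Redis leaves it to the operating system to decide when to flush. appendfsync no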
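The three policies differ only in when fsync is called after a write. The following Python sketch models that decision. The class AofWriter, its clock parameter, and the file name demo.aof are assumptions made for illustration. Redis performs the everysec fsync in a background thread, whereas this sketch checks the elapsed time inline on each append.

```python
import os
import tempfile
import time


class AofWriter:
    """Toy append-only writer with a Redis-like fsync policy (hypothetical)."""

    def __init__(self, path: str, policy: str = "everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: %r" % policy)
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = policy
        self.clock = clock
        self.last_fsync = clock()
        self.fsync_calls = 0

    def append(self, record: bytes) -> None:
        os.write(self.fd, record)  # data is now in the OS buffer
        now = self.clock()
        due = self.policy == "always" or (
            self.policy == "everysec" and now - self.last_fsync >= 1.0
        )
        if due:
            os.fsync(self.fd)      # ask the OS to flush to the device
            self.fsync_calls += 1
            self.last_fsync = now
        # policy "no": never fsync here, the OS flushes when it wants

    def close(self) -> None:
        os.close(self.fd)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.aof")
    writer = AofWriter(path, policy="always")
    for i in range(3):
        writer.append(b"SET k%d v\n" % i)
    print(writer.fsync_calls)  # 3: one fsync per append under "always"
    writer.close()
```

With everysec, several appends within the same second share a single fsync, which is where the performance gain and the one-second loss window both come from.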
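The AOF fix tool: redis-check-aof
Rewriting the AOF (the BGREWRITEAOF command) reduces the file size, but it does not repair a damaged AOF file.
For a damaged AOF file, Redis ships a tool named redis-check-aof that checks AOF files and repairs errors in them. It helps identify format errors or damaged entries in the file so that the data can be recovered as far as possible.
To use redis-check-aof, follow these steps:
- Make sure the Redis server is stopped, so that nothing is writing to the AOF file.
- Open a terminal and change to the Redis installation directory.
- Run the following command to check and repair the AOF file: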
redis-check-aof --fix <aof-file>
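where <aof-file> is the path and name of your AOF file.
- redis-check-aof tries to repair the errors in the AOF file and prints the result to the terminal.
- Use the tool's output to determine whether the repair succeeded. If it reports errors it cannot repair, consider restoring the data from a backup.
- If the repair succeeded, restart the Redis server so that it loads the repaired AOF file.
Note that redis-check-aof does not guarantee that all data in a damaged AOF file can be recovered. If the file is badly damaged, some data may be lost for good. Regular backups of your Redis data are therefore very important.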
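AOF practice
First, enable appendonly in redis.conf: appendonly yes
vim /myredis/redis7.conf # the actual location of your redis.conf
# The changes are shown below
############################## APPEND ONLY MODE ###############################
# enable AOF (keep the comment on its own line: redis.conf does not accept a trailing comment after a directive)
appendonly yes
Tip: remember to restart the Redis server every time you change the configuration, otherwise the change does not take effect.
After the change, restart the server and look in the Redis directory: you will find appendonlydir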
[root@192 myredis]# cd appendonlydir/
[root@192 appendonlydir]# ll
总用量 8
-rw-r--r--. 1 root root 88 2月 22 17:27 appendonly.aof.1.base.rdb
-rw-r--r--. 1 root root 0 2月 22 17:27 appendonly.aof.1.incr.aof
-rw-r--r--. 1 root root 88 2月 22 17:27 appendonly.aof.manifest
[root@192 appendonlydir]#
Next, run a few operations from the Redis client, then take another look with cat appendonly.aof.1.incr.aof
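Advantages and disadvantages of AOF
- Advantages
(1) AOF protects better against data loss. With the usual setting, a background thread runs fsync once per second, so at most 1 second of data is lost.
(2) The AOF log is written append-only, so there is no disk seek overhead. Write performance is very high and the file is hard to corrupt.
(3) Even when the AOF log grows large and a background rewrite runs, client reads and writes are not affected.
(4) The commands in the AOF log are recorded in a very readable form, which makes it well suited to emergency recovery from a catastrophic accidental deletion. For example, if someone accidentally clears all data with flushall, then as long as no background rewrite has happened yet, you can immediately copy the AOF file, delete the final flushall command from it, put the file back, and let the recovery mechanism restore all the data automatically.
- Disadvantages
(1) For the same data, the AOF log is usually larger than the RDB snapshot file.
(2) With AOF enabled, the write QPS Redis can sustain is lower than with RDB, because AOF is usually configured to fsync the log once per second. That said, performance with one fsync per second is still very high.
(3) AOF has had bugs in the past where recovering data from the AOF log did not reproduce exactly the same data.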
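The flushall recovery described under advantage (4) can be sketched in Python. The sketch assumes the file holds nothing but RESP-encoded commands, with no RDB preamble and no timestamp annotations. The functions parse_commands and strip_trailing_flushall are hypothetical helpers, not tools shipped with Redis. Always work on a copy of the file with the server stopped.

```python
from typing import List, Tuple


def parse_commands(buf: bytes) -> List[Tuple[int, List[bytes]]]:
    """Return (start_offset, args) for every command in a RESP command log."""
    commands = []
    pos = 0
    while pos < len(buf):
        start = pos
        if buf[pos:pos + 1] != b"*":
            raise ValueError("expected '*' at offset %d" % pos)
        end = buf.index(b"\r\n", pos)
        argc = int(buf[pos + 1:end])
        pos = end + 2
        args = []
        for _ in range(argc):
            if buf[pos:pos + 1] != b"$":
                raise ValueError("expected '$' at offset %d" % pos)
            end = buf.index(b"\r\n", pos)
            length = int(buf[pos + 1:end])
            pos = end + 2
            args.append(buf[pos:pos + length])
            pos += length + 2  # skip the argument and its trailing \r\n
        commands.append((start, args))
    return commands


def strip_trailing_flushall(buf: bytes) -> bytes:
    """Drop the last command if, and only if, it is FLUSHALL."""
    commands = parse_commands(buf)
    if commands:
        start, args = commands[-1]
        if args and args[0].upper() == b"FLUSHALL":
            return buf[:start]
    return buf


if __name__ == "__main__":
    aof = (b"*3\r\n$3\r\nSET\r\n$2\r\nk1\r\n$2\r\nv1\r\n"
           b"*1\r\n$8\r\nFLUSHALL\r\n")
    print(strip_trailing_flushall(aof))
    # b'*3\r\n$3\r\nSET\r\n$2\r\nk1\r\n$2\r\nv1\r\n'
```

Cutting at the recorded start offset, instead of editing text by hand, keeps every remaining command byte-for-byte intact.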