[MySQL] 二进制文件

本文涉及的产品
云数据库 RDS MySQL,集群系列 2核4GB
推荐场景:
搭建个人博客
RDS MySQL Serverless 基础系列,0.5-2RCU 50GB
云数据库 RDS MySQL,高可用系列 2核4GB
简介: [MySQL] 二进制文件


日志文件

binlog

是什么

简介

The binary log contains “events” that describe database changes such as table creation operations or changes to table data. It also contains events for statements that potentially could have made changes (for example, a DELETE which matched no rows), unless row-based logging is used. The binary log also contains information about how long each statement took that updated data. The binary log has two important purposes:

  • For replication, the binary log on a replication source server provides a record of the data changes to be sent to replicas. The source sends the information contained in its binary log to its replicas, which reproduce those transactions to make the same data changes that were made on the source. See Section 17.2, “Replication Implementation”.
  • Certain data recovery operations require use of the binary log. After a backup has been restored, the events in the binary log that were recorded after the backup was made are re-executed. These events bring databases up to date from the point of the backup. See Section 7.5, “Point-in-Time (Incremental) Recovery”.

The binary log is not used for statements such as SELECT or SHOW that do not modify data. To log all statements (for example, to identify a problem query), use the general query log. See Section 5.4.3, “The General Query Log”.

二进制文件,通常用于备份与恢复。

The binary log is resilient to unexpected halts. Only complete events or transactions are logged or read back.

产生方式

mysqld appends a numeric extension to the binary log base name to generate binary log file names. The number increases each time the server creates a new log file, thus creating an ordered series of files. The server creates a new file in the series each time any of the following events occurs:

  • The server is started or restarted
  • The server flushes the logs.
  • The size of the current log file reaches max_binlog_size.

A binary log file may become larger than max_binlog_size if you are using large transactions because a transaction is written to the file in one piece, never split between files.

binlog文件生成的时机:

  • 服务器已启动或重新启动
  • 服务器刷新日志。
  • 当前日志文件的大小达到 max_binlog_size.(当然文件也可能比文件限制大,比如说一个大事务的数据插入,因为事务数据不允许分离)

To keep track of which binary log files have been used, mysqld also creates a binary log index file that contains the names of the binary log files. By default, this has the same base name as the binary log file, with the extension ‘.index’. You can change the name of the binary log index file with the –log-bin-index[=file_name] option. You should not manually edit this file while mysqld is running; doing so would confuse mysqld.

日志索引文件,不建议在 mysql运行时修改,这样会导致异常。

By default, the server logs the length of the event as well as the event itself and uses this to verify that the event was written correctly. You can also cause the server to write checksums for the events by setting the binlog_checksum system variable. When reading back from the binary log, the source uses the event length by default, but can be made to use checksums if available by enabling the system variable source_verify_checksum (from MySQL 8.0.26) or master_verify_checksum (before MySQL 8.0.26). The replication I/O (receiver) thread on the replica also verifies events received from the source. You can cause the replication SQL (applier) thread to use checksums if available when reading from the relay log by enabling the system variable replica_sql_verify_checksum (from MySQL 8.0.26) or slave_sql_verify_checksum (before MySQL 8.0.26).

可以开启校验和校验文件完整性。

The format of the events recorded in the binary log is dependent on the binary logging format. Three format types are supported: row-based logging, statement-based logging and mixed-base logging. The binary logging format used depends on the MySQL version. For general descriptions of the logging formats, see Section 5.4.4.1, “Binary Logging Formats”. For detailed information about the format of the binary log, see MySQL Internals: The Binary Log.

文件格式

The server uses several logging formats to record information in the binary log:

  • Replication capabilities in MySQL originally were based on propagation of SQL statements from source to replica. This is called statement-based logging. You can cause this format to be used by starting the server with --binlog-format=STATEMENT.
  • In row-based logging (the default), the source writes events to the binary log that indicate how individual table rows are affected. You can cause the server to use row-based logging by starting it with --binlog-format=ROW.
  • A third option is also available: mixed logging. With mixed logging, statement-based logging is used by default, but the logging mode switches automatically to row-based in certain cases as described below. You can cause MySQL to use mixed logging explicitly by starting mysqld with the option --binlog-format=MIXED.

文件格式主要有三种,基于语句、基于数据行、混合模式

  • STATEMENT 导致日志记录是基于语句的。
  • ROW 导致日志记录基于行。这是默认值。
  • MIXED 导致日志记录使用混合格式。

The logging format can also be set or limited by the storage engine being used. This helps to eliminate issues when replicating certain statements between a source and replica which are using different storage engines.

With statement-based replication, there may be issues with replicating nondeterministic statements. In deciding whether or not a given statement is safe for statement-based replication, MySQL determines whether it can guarantee that the statement can be replicated using statement-based logging. If MySQL cannot make this guarantee, it marks the statement as potentially unreliable and issues the warning, Statement may not be safe to log in statement format.

You can avoid these issues by using MySQL’s row-based replication instead.

日志记录格式也可以由正在使用的存储引擎设置或限制。这有助于消除在使用不同存储引擎的源和副本之间复制某些语句时出现的问题。

对于基于语句的复制,复制非确定性语句可能会出现问题。在决定给定语句对于基于语句的复制是否安全时,MySQL 确定它是否可以保证可以使用基于语句的日志记录来复制该语句。如果 MySQL 不能做出这种保证,它会将语句标记为潜在不可靠并发出警告。您可以改用 MySQL 的基于行的复制来避免这些问题。

statement
row

With the binary log format set to ROW, many changes are written to the binary log using the row-based format. Some changes, however, still use the statement-based format. Examples include all DDL (data definition language) statements such as CREATE TABLE, ALTER TABLE, or DROP TABLE.

When row-based binary logging is used, the binlog_row_event_max_size system variable and its corresponding startup option –binlog-row-event-max-size set a soft limit on the maximum size of row events. The default value is 8192 bytes, and the value can only be changed at server startup. Where possible, rows stored in the binary log are grouped into events with a size not exceeding the value of this setting. If an event cannot be split, the maximum size can be exceeded.

将二进制日志格式设置为ROW,许多更改将使用基于行的格式写入二进制日志。但是,一些更改仍然使用基于语句的格式。示例包括所有 DDL(数据定义语言)语句,例如 CREATE TABLE、 ALTER TABLE或 DROP TABLE

使用基于行的二进制日志记录时, binlog_row_event_max_size 系统变量及其相应的启动选项 --binlog-row-event-max-size 对行事件的最大大小设置软限制。默认值为 8192 字节,该值只能在服务器启动时更改。在可能的情况下,存储在二进制日志中的行被分组为大小不超过此设置值的事件。如果无法拆分事件,则可能会超过最大大小。

mixed

怎么办

You can display the contents of binary log files with the mysqlbinlog utility. This can be useful when you want to reprocess statements in the log for a recovery operation. For example, you can update a MySQL server from the binary log as follows:

mysqlbinlog log_file | mysql -h server_name

mysqlbinlog also can be used to display the contents of the relay log file on a replica, because they are written using the same format as binary log files. For more information on the mysqlbinlog utility and how to use it, see Section 4.6.9, “mysqlbinlog — Utility for Processing Binary Log Files”. For more information about the binary log and recovery operations, see Section 7.5, “Point-in-Time (Incremental) Recovery”.

可以使用 mysqlbinlog程序查询二进制文件内容。

Binary logging is done immediately after a statement or transaction completes but before any locks are released or any commit is done. This ensures that the log is logged in commit order.

Within an uncommitted transaction, all updates (UPDATE, DELETE, or INSERT) that change transactional tables such as InnoDB tables are cached until a COMMIT statement is received by the server. At that point, mysqld writes the entire transaction to the binary log before the COMMIT is executed.

Modifications to nontransactional tables cannot be rolled back. If a transaction that is rolled back includes modifications to nontransactional tables, the entire transaction is logged with a ROLLBACK statement at the end to ensure that the modifications to those tables are replicated.

二进制日志记录在语句或事务完成后但在释放任何锁或完成任何提交之前立即完成。这确保日志以提交顺序记录。

对非事务性表的更新在执行后立即存储在二进制日志中。

在未提交的事务中,所有更改事务表(例如表)的 更新(UPDATEDELETE或 )都会被缓存,直到 服务器收到一条语句。此时,mysqld在执行之前将整个事务写入二进制日志 。 INSERTInnoDBCOMMITCOMMIT

不能回滚对非事务性表的修改。如果回滚的事务包括对非事务表的修改,则整个事务会 ROLLBACK 在末尾记录一条语句,以确保复制对这些表的修改。

When a thread that handles the transaction starts, it allocates a buffer of binlog_cache_size to buffer statements. If a statement is bigger than this, the thread opens a temporary file to store the transaction. The temporary file is deleted when the thread ends. From MySQL 8.0.17, if binary log encryption is active on the server, the temporary file is encrypted.

处理 sql语句时,语句会被放到一个 缓冲区,如果语句大小大于 binlog_cache_size时,语句会被存储到一个临时文件,处理完之后,临时文件被删除。

设置文件存储格式
  • 全局
mysql> SET GLOBAL binlog_format = 'STATEMENT';
mysql> SET GLOBAL binlog_format = 'ROW';
mysql> SET GLOBAL binlog_format = 'MIXED';
  • 会话
mysql> SET SESSION binlog_format = 'STATEMENT';
mysql> SET SESSION binlog_format = 'ROW';
mysql> SET SESSION binlog_format = 'MIXED';
  • 为什么会需要会话设置

客户端可能希望在每个会话的基础上设置二进制日志记录的原因有多种:

  • 对数据库进行许多小更改的会话可能需要使用基于行的日志记录。
  • 执行与子句中的许多行匹配的更新的会话 WHERE 可能希望使用基于语句的日志记录,因为记录少量语句比记录许多行更有效。
  • 有些语句在源上需要大量执行时间,但只会修改几行。因此,使用基于行的日志记录来复制它们可能是有益的。

If you are using InnoDB tables and the transaction isolation level is READ COMMITTED or READ UNCOMMITTED, only row-based logging can be used. It is possible to change the logging format to STATEMENT, but doing so at runtime leads very rapidly to errors because InnoDB can no longer perform inserts.

如果您正在使用 InnoDB 表并且事务隔离级别为 READ COMMITTEDREAD UNCOMMITTED,则只能使用基于行的日志记录。

缓冲区大小的调整方式

The Binlog_cache_use status variable shows the number of transactions that used this buffer (and possibly a temporary file) for storing statements. The Binlog_cache_disk_use status variable shows how many of those transactions actually had to use a temporary file. These two variables can be used for tuning binlog_cache_size to a large enough value that avoids the use of temporary files.

The max_binlog_cache_size system variable (default 4GB, which is also the maximum) can be used to restrict the total size used to cache a multiple-statement transaction. If a transaction is larger than this many bytes, it fails and rolls back. The minimum value is 4096.

默认 4GB

If you are using the binary log and row based logging, concurrent inserts are converted to normal inserts for CREATE … SELECT or INSERT … SELECT statements. This is done to ensure that you can re-create an exact copy of your tables by applying the log during a backup operation. If you are using statement-based logging, the original statement is written to the log.

如果您使用二进制日志和基于行的日志记录,并发插入将转换为普通插入 CREATE ... SELECTINSERT ... SELECT 语句。这样做是为了确保您可以通过在备份操作期间应用日志来重新创建表的精确副本。如果您使用的是基于语句的日志记录,则原始语句将写入日志。

缓冲区写入文件的时机 sync_binlog

By default, the binary log is synchronized to disk at each write (sync_binlog=1). If sync_binlog was not enabled, and the operating system or machine (not only the MySQL server) crashed, there is a chance that the last statements of the binary log could be lost. To prevent this, enable the sync_binlog system variable to synchronize the binary log to disk after every N commit groups. See Section 5.1.8, “Server System Variables”. The safest value for sync_binlog is 1 (the default), but this is also the slowest.

sync_binlog默认是 1,即每个提交都写入文件,可以设置为 N表示 Ncommit再写入到文件。

In earlier MySQL releases, there was a chance of inconsistency between the table content and binary log content if a crash occurred, even with sync_binlog set to 1. For example, if you are using InnoDB tables and the MySQL server processes a COMMIT statement, it writes many prepared transactions to the binary log in sequence, synchronizes the binary log, and then commits the transaction into InnoDB. If the server unexpectedly exited between those two operations, the transaction would be rolled back by InnoDB at restart but still exist in the binary log. Such an issue was resolved in previous releases by enabling InnoDB support for two-phase commit in XA transactions. In MySQL 8.0, InnoDB support for two-phase commit in XA transactions is always enabled.

在早期的 MySQL 版本中,如果发生崩溃,表内容和二进制日志内容之间有可能不一致,即使 sync_binlog 设置为 1。例如,如果您正在使用InnoDB 表并且 MySQL 服务器处理一条 COMMIT 语句,它会写入许多准备好的事务顺序写入二进制日志,同步二进制日志,然后将事务提交到 InnoDB. 如果服务器在这两个操作之间意外退出,事务将在重启时回滚InnoDB,但仍然存在于二进制日志中。InnoDB通过在 XA 事务中启用对两阶段提交的支持,在以前的版本中解决了这样的问题。在 MySQL 8.0 中, InnoDB始终启用 XA 事务中对两阶段提交的支持。

InnoDB support for two-phase commit in XA transactions ensures that the binary log and InnoDB data files are synchronized. However, the MySQL server should also be configured to synchronize the binary log and the InnoDB logs to disk before committing the transaction. The InnoDB logs are synchronized by default, and sync_binlog=1 ensures the binary log is synchronized. The effect of implicit InnoDB support for two-phase commit in XA transactions and sync_binlog=1 is that at restart after a crash, after doing a rollback of transactions, the MySQL server scans the latest binary log file to collect transaction xid values and calculate the last valid position in the binary log file. The MySQL server then tells InnoDB to complete any prepared transactions that were successfully written to the to the binary log, and truncates the binary log to the last valid position. This ensures that the binary log reflects the exact data of InnoDB tables, and therefore the replica remains in synchrony with the source because it does not receive a statement which has been rolled back.

InnoDB支持 XA 事务中的两阶段提交确保二进制日志和 InnoDB数据文件同步。但是,MySQL服务器还应该配置为InnoDB在提交事务之前将二进制日志和日志同步到磁盘。InnoDB默认同步日志,保证 sync_binlog=1 二进制日志同步。XA 事务中隐式 InnoDB支持两阶段提交的效果 sync_binlog=1 是,在崩溃后重新启动时,在事务回滚后,MySQL 服务器扫描最新的二进制日志文件以收集事务 xid 值并计算最后一个有效位置在二进制日志文件。MySQL 服务器然后告诉InnoDB完成任何已成功写入二进制日志的准备事务,并将二进制日志截断到最后一个有效位置。这确保二进制日志反映 InnoDB表的准确数据,因此副本与源保持同步,因为它没有收到已回滚的语句。

If the MySQL server discovers at crash recovery that the binary log is shorter than it should have been, it lacks at least one successfully committed InnoDB transaction. This should not happen if sync_binlog=1 and the disk/file system do an actual sync when they are requested to (some do not), so the server prints an error message The binary log file_name is shorter than its expected size. In this case, this binary log is not correct and replication should be restarted from a fresh snapshot of the source’s data.

如果 MySQL 服务器在崩溃恢复时发现二进制日志比它应该的短,它至少缺少一个成功提交的InnoDB事务。sync_binlog=1 如果磁盘/文件系统在请求时进行了实际同步(有些则没有),则不应发生这种情况,因此服务器会打印一条错误消息。在这种情况下,此二进制日志不正确,应从源数据的新快照重新启动复制。

相关实践学习
如何在云端创建MySQL数据库
开始实验后,系统会自动创建一台自建MySQL的 源数据库 ECS 实例和一台 目标数据库 RDS。
全面了解阿里云能为你做什么
阿里云在全球各地部署高效节能的绿色数据中心,利用清洁计算为万物互联的新世界提供源源不断的能源动力,目前开服的区域包括中国(华北、华东、华南、香港)、新加坡、美国(美东、美西)、欧洲、中东、澳大利亚、日本。目前阿里云的产品涵盖弹性计算、数据库、存储与CDN、分析与搜索、云通信、网络、管理与监控、应用服务、互联网中间件、移动服务、视频服务等。通过本课程,来了解阿里云能够为你的业务带来哪些帮助     相关的阿里云产品:云服务器ECS 云服务器 ECS(Elastic Compute Service)是一种弹性可伸缩的计算服务,助您降低 IT 成本,提升运维效率,使您更专注于核心业务创新。产品详情: https://www.aliyun.com/product/ecs
相关文章
|
5天前
|
存储 Oracle 关系型数据库
【赵渝强老师】MySQL InnoDB的数据文件与重做日志文件
本文介绍了MySQL InnoDB存储引擎中的数据文件和重做日志文件。数据文件包括`.ibd`和`ibdata`文件,用于存放InnoDB数据和索引。重做日志文件(redo log)确保数据的可靠性和事务的持久性,其大小和路径可由相关参数配置。文章还提供了视频讲解和示例代码。
105 11
【赵渝强老师】MySQL InnoDB的数据文件与重做日志文件
|
1月前
|
SQL 关系型数据库 MySQL
|
29天前
|
存储 关系型数据库 MySQL
PACS系统 中 dicom 文件在mysql 8.0 数据库中的 存储和读取(pydicom 库使用)
PACS系统 中 dicom 文件在mysql 8.0 数据库中的 存储和读取(pydicom 库使用)
25 2
|
1月前
|
SQL 存储 关系型数据库
SQL文件导入MySQL数据库的详细指南
数据库中的数据转移是一项常规任务,无论是在数据迁移过程中,还是在数据备份、还原场景中,导入导出SQL文件显得尤为重要。特别是在使用MySQL数据库时,如何将SQL文件导入数据库是一项基本技能。本文将详细介绍如何将SQL文件导入MySQL数据库,并提供一个清晰、完整的步骤指南。这篇文章的内容字数大约在
110 1
|
2月前
|
SQL 关系型数据库 MySQL
Python小技巧——将CSV文件导入到MySQL数据库
Python小技巧——将CSV文件导入到MySQL数据库
43 0
|
3月前
|
存储 SQL 关系型数据库
mysql体系结构及主要文件
了解MySQL的体系结构和它的主要文件,能够帮助数据库管理员和开发者更好地管理和优化数据库。这对于数据库的正常运行、性能优化、数据恢复和系统维护来说是至关重要的。通过对这些组件的深入理解,可以更容易地解决数据库问题,并在必要时定制或调整数据库的行为以满足特定的应用需求。
60 0
|
3月前
|
SQL 关系型数据库 MySQL
【MySQL核心】MySQL无法启动?批量恢复MySQL 物理文件-拯救即将跑路的你
【MySQL核心】MySQL无法启动?批量恢复MySQL 物理文件-拯救即将跑路的你
|
3月前
|
关系型数据库 MySQL Windows
MySQL数据导入:MySQL 导入 Excel 文件.md
MySQL数据导入:MySQL 导入 Excel 文件.md
|
4月前
|
存储 关系型数据库 MySQL
面试题MySQL问题之FastDFS中的文件重复上传如何解决
面试题MySQL问题之FastDFS中的文件重复上传如何解决
43 1
|
4月前
|
关系型数据库 MySQL Java