我也是只菜鸡,blog写的不对或者不严谨的地方还请大伙指出来,我及时改正,免得误人子弟。
实验环境:
CentOS7.3.1611 + MySQL社区版 5.7.19
参考:
小菜鸟DBA的微信公众号推送
官方文档:
https://dev.mysql.com/doc/internals/en/binary-log-versions.html
https://dev.mysql.com/doc/internals/en/row-based-binary-logging.html
https://dev.mysql.com/doc/internals/en/event-classes-and-types.html
https://dev.mysql.com/doc/internals/en/event-header-fields.html
https://dev.mysql.com/doc/internals/en/event-meanings.html
https://dev.mysql.com/doc/internals/en/event-data-for-specific-event-types.html
3个在线工具:
http://tool.oschina.net/hexconvert/ 在线进制转换
http://tool.chinaz.com/Tools/unixtime.aspx Unix时间戳
https://www.bejson.com/convert/ox2str/ 16进制转字符串
binlog实际上由一个个不同类型的binlog event组成,每个binlog event还包含了event header部分和event data部分(可选)。
【注意:每个event最后还有4bytes的校验位,官方文档并没有提到这个地方,不然分析event物理格式时候会发现event长度对不上号】
常见的一个binlog物理文件有如下组成部分:
1、4字节的magic number作为binlog文件的开头
2、N个不同类型的binlog event
3、rotate event 作为binlog文件的结尾(正在使用的binlog里面是没有rotate event的)
此外,还有一个索引文件记录当前有哪些binlog文件,及当前正在使用的binlog文件。(文件名类似:mysql-bin.index)
下表就是的binlog event的一般格式:
+=====================================+
| event | timestamp 0 : 4 |
| header +----------------------------+
| | type_code 4 : 1 | = FORMAT_DESCRIPTION_EVENT = 15(binlog v4)
| +----------------------------+
| | server_id 5 : 4 |
| +----------------------------+
| | event_length 9 : 4 | >= 91
| +----------------------------+
| | next_position 13 : 4 |
| +----------------------------+
| | flags 17 : 2 |
+=====================================+
| event | binlog_version 19 : 2 | = 4
| data +----------------------------+
| | server_version 21 : 50 |
| +----------------------------+
| | create_timestamp 71 : 4 |
| +----------------------------+
| | header_length 75 : 1 |
| +----------------------------+
| | post-header 76 : n | = array of n bytes, one byte per event
| | lengths for all | type that the server knows about
| | event types |
+=====================================+
常用的EVENT如下:
FORMAT_DESCRIPTION_EVENT:binlog文件的第一个event,记录版本号等元数据信息
QUERY_EVENT: 存储statement类的信息,基于statement的binlog格式记录sql语句,在row模式下记录事务begin标签
XID_EVENT: 二阶段提交xid记录
TABLE_MAP_EVENT: row模式下记录表源数据,对读取行记录提供规则参考,后面会详细介绍
WRITE_ROWS_EVENT/DELETE_ROWS_EVENT/UPDATE_ROWS_EVENT: row模式下记录对应行数据变化的记录
GTID_LOG_EVENT: 这个就是记录GTID事务号了,用于5.6版本之后基于GTID同步的方式
ROTATE_EVENT: 连接下一个binlog文件
需要了解更全面的Event类型详见: https://dev.mysql.com/doc/internals/en/event-classes-and-types.html
下面是我截取的一个完整的binlog文件,具体的events如下:
目前,我们一般都是使用row格式的binlog,其他的mixed和statement格式的binlog这里不去关注了。
对于row格式的DML操作而言,实际上在binlog里面记录的是:TABLE_MAP_EVENT+ ROW_LOG_EVENT(ROW_LOG_EVENT还可以细分为WRITE_ROWS_EVENT、UPDATE_ROWS_EVENT、DELETE_ROWS_EVENT)
为什么一个update在ROW模式下需要分解成两个event:一个Table_map,一个Update_rows?
我们想象一下,一个update如果更新了10000条数据,那么对应的表结构信息是否需要记录10000次?其实是对同一个表的操作,所以这里binlog只是记录了一个Table_map用于记录表结构相关信息,而后面的Update_rows记录了更新数据的行信息。他们之间是通过table_id来联系的。【table_id不是固定的,是一个变量,占用的是table_definition_cache和table_open_cache空间(因此flush tables会造成table_id的增长)】
如下是一个insert插入1条记录的binlog,可以看到有table_map+ write_rows 这2个event组成。
table_map记录的是表的元数据信息,例如库名、表名、字段类型等信息。
补充:
关于table_id的几篇干货:
http://blog.itpub.net/22664653/viewspace-1158547/ 【杨奇龙】
http://agapple.iteye.com/blog/1797061
http://www.cnblogs.com/yuyue2014/p/3721172.html
http://www.sohu.com/a/130698375_610509 【宋利兵】
http://www.cnblogs.com/cenalulu/archive/2012/09/24/2699907.html 【卢钧轶】
其他的几个EVENT类型:
官方文档:https://dev.mysql.com/doc/internals/en/event-data-for-specific-event-types.html
FORMAT_DESCRIPTION_EVENT
这个是最基础的event,每个新的binlog头部就带有这个event。每一个binlog文件只能存在一个FORMAT_DESCRIPTION_EVENT。
Fixed data part:
2 bytes. The binary log format version. This is 4 in MySQL 5.0 and up.
50 bytes. The MySQL server's version (example: 5.0.14-debug-log), padded with 0x00 bytes on the right.
4 bytes. Timestamp in seconds when this event was created (this is the moment when the binary log was created). This value is redundant; the same value occurs in the timestamp header field.
1 byte. The header length. This length - 19 gives the size of the extra headers field at the end of the header for other events.
Variable-sized. An array that indicates the post-header lengths for all event types. There is one byte per event type that the server knows about.
FORMAT_DESCRIPTION_EVENT 实例:
flush logs; 产生一个全新的binlog文件,导出后如下:
master [localhost] {root} ((none)) > show binlog events in 'mysql-bin.000002';
+------------------+-----+-------------+-----------+-------------+---------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+------------------+-----+-------------+-----------+-------------+---------------------------------------+
| mysql-bin.000002 | 4 | Format_desc | 1 | 120 | Server ver: 5.6.37-log, Binlog ver: 4 |
+------------------+-----+-------------+-----------+-------------+---------------------------------------+
[root@test_mysql26 /root/sandboxes/rsandbox_5_6_37/master/data ]# hexdump -C mysql-bin.000002
00000000 fe 62 69 6e f6 e3 fe 59 0f 01 00 00 00 74 00 00 |.bin...Y.....t..|
00000010 00 78 00 00 00 01 00 04 00 35 2e 36 2e 33 37 2d |.x.......5.6.37-|
00000020 6c 6f 67 00 00 00 00 00 00 00 00 00 00 00 00 00 |log.............|
00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 13 |................|
00000050 38 0d 00 08 00 12 00 04 04 04 04 12 00 00 5c 00 |8.............\.|
00000060 04 1a 08 00 00 00 08 08 08 02 00 00 00 0a 0a 0a |................|
00000070 19 19 00 01 08 4c 67 48 |.....LgH|
00000078
magic number (4bytes)
fe 62 69 6e
event header (19bytes)
f6 e3 fe 59 timestamp
0f type_code 表示binlog采用v4版本的
01 00 00 00 server_id
74 00 00 00 event_length 116bytes
78 00 00 00 next_position 下一个event从120开始
01 00 flags
event data:
04 00 binlog version ,表示v4版的binlog格式
35 2e 36 2e 33 37 2d 6c 6f 67 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 表示的是server_version ,转换成字符串就是5.6.37-log
00 00 00 00 create_timestamp ,用的是相对时间
13 表示的是event header的长度,十进制表示就是19bytes
38 0d 00 08 00 12 00 04 04 04 04 12 00 00 5c 00 04 1a 08 00 00 00 08 08 08 02 00 00 00 0a 0a 0a 19 19 00 01 36种event 类型
08 4c 67 48 4bytes校验位。
36种event类型:https://dev.mysql.com/doc/internals/en/event-classes-and-types.html
STOP_EVENT:
当正常关闭mysqld时候,或者是从库上执行了reset slave 都 产生这个stop_event
A Stop_log_event is written under these circumstances:
A master writes the event to the binary log when it shuts down
A slave writes the event to the relay log when it shuts down or when a RESET SLAVE statement is executed
STOP_EVENT实例:
/etc/init.d/mysqld restart
master [localhost] {root} ((none)) > show binlog events in 'mysql-bin.000002';
+------------------+-----+-------------+-----------+-------------+---------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+------------------+-----+-------------+-----------+-------------+---------------------------------------+
| mysql-bin.000002 | 4 | Format_desc | 1 | 120 | Server ver: 5.6.37-log, Binlog ver: 4 |
| mysql-bin.000002 | 120 | Stop | 1 | 143 | |
+------------------+-----+-------------+-----------+-------------+---------------------------------------+
2 rows in set (0.00 sec)
[root@test_mysql26 /root/sandboxes/rsandbox_5_6_37/master/data ]# hexdump -C mysql-bin.000002 -s 120
00000078 f9 f0 fe 59 03 01 00 00 00 17 00 00 00 8f 00 00 |...Y............|
00000088 00 00 00 39 d3 4f ad |...9.O.|
0000008f
含义未知。官方没有说。
QYERY_EVENT:
使用begin命令开启一个事务的时候,会产生QUERY_EVENT
固定部分:
4bytes thread_id 可以用于审计
4bytes 该语句的执行时长,单位秒
1byte 执行命令时候所在的库名的字节长度
2bytes 错误代码
2bytes 记录data part部分variable status的长度
可变部分:
0或者更多状态变量
默认的库名
SQL_Statement
master [localhost] {root} (test) > begin;
master [localhost] {root} (test) > insert into tttt2 select 'AAAA';
master [localhost] {root} (test) > commit;
master [localhost] {root} (test) > show binlog events in 'mysql-bin.000001';
+------------------+-----+-------------+-----------+-------------+---------------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+------------------+-----+-------------+-----------+-------------+---------------------------------------------+
| mysql-bin.000001 | 4 | Format_desc | 1 | 120 | Server ver: 5.6.37-log, Binlog ver: 4 |
| mysql-bin.000001 | 120 | Query | 1 | 199 | BEGIN |
| mysql-bin.000001 | 199 | Query | 1 | 304 | use `test`; insert into tttt2 select 'AAAA' |
| mysql-bin.000001 | 304 | Xid | 1 | 335 | COMMIT /* xid=40 */ |
+------------------+-----+-------------+-----------+-------------+---------------------------------------------+
4 rows in set (0.00 sec)
[root@test_mysql26 /root/sandboxes/rsandbox_5_6_37/master/data ]# hexdump -C mysql-bin.000001 -s 120
00000078 de f3 fe 59 02 01 00 00 00 4f 00 00 00 c7 00 00 |...Y.....O......|
00000088 00 08 00 01 00 00 00 00 00 00 00 04 00 00 21 00 |..............!.|
00000098 00 00 00 00 00 01 00 00 00 40 00 00 00 00 06 03 |.........@......|
000000a8 73 74 64 04 21 00 21 00 08 00 0c 01 74 65 73 74 |std.!.!.....test|
000000b8 00 74 65 73 74 00 42 45 47 49 4e 37 1f 09 57 de |.test.BEGIN7..W.|
000000c8 f3 fe 59 02 01 00 00 00 69 00 00 00 30 01 00 00 |..Y.....i...0...|
000000d8 00 00 01 00 00 00 00 00 00 00 04 00 00 21 00 00 |.............!..|
000000e8 00 00 00 00 01 00 00 00 40 00 00 00 00 06 03 73 |........@......s|
000000f8 74 64 04 21 00 21 00 08 00 0c 01 74 65 73 74 00 |td.!.!.....test.|
00000108 74 65 73 74 00 69 6e 73 65 72 74 20 69 6e 74 6f |test.insert into|
00000118 20 74 74 74 74 32 20 73 65 6c 65 63 74 20 27 41 | tttt2 select 'A|
00000128 41 41 41 27 73 f4 e2 90 e0 f3 fe 59 10 01 00 00 |AAA's......Y....|
00000138 00 1f 00 00 00 4f 01 00 00 00 00 28 00 00 00 00 |.....O.....(....|
00000148 00 00 00 17 4f 16 46 |....O.F|
0000014f
.... 这个类型的event分析卡住了,谁来帮我下。。。。。。。。。 【参考http://www.jianshu.com/p/c16686b35807】
ROTATE_EVENT:
当flush logs或者正常的切割binlog时候,会产生ROTATE_EVENT。
When a binary log file exceeds a size limit, a ROTATE_EVENT is written at the end of the file that points to the next file in the squence. This event is information for the slave to know the name of the next binary log it is going to receive.
Fixed data part:
8 bytes. The position of the first event in the next log file. Always contains the number 4 (meaning the next event starts at position 4 in the next binary log). This field is not present in v1; presumably the value is assumed to be 4.
Variable data part:
The name of the next binary log. The filename is not null-terminated. Its length is the event size minus the size of the fixed parts.
XID_EVENT:
为了事务的一致性,写binlog的时候,先写事务的语句,然后写xid标志,最后才是提交COMMIT命令。
Fixed part为空
variable part 8bytes,记录的是xid编号
关于rotate event的例子:
master [localhost] {root} ((none)) > show binlog events in 'mysql-bin.000001';
+------------------+-----+-------------+-----------+-------------+---------------------------------------+
| Log_name | Pos | Event_type | Server_id | End_log_pos | Info |
+------------------+-----+-------------+-----------+-------------+---------------------------------------+
| mysql-bin.000001 | 4 | Format_desc | 1 | 120 | Server ver: 5.6.37-log, Binlog ver: 4 |
| mysql-bin.000001 | 120 | Rotate | 1 | 167 | mysql-bin.000002;pos=4 |
+------------------+-----+-------------+-----------+-------------+---------------------------------------+
2 rows in set (0.00 sec)
[root@test_mysql26 /root/sandboxes/rsandbox_5_6_37/master/data ]# hexdump -C mysql-bin.000001 -s 4 -n 19 导出format desc event的内容
00000004 e2 e3 fe 59 0f 01 00 00 00 74 00 00 00 78 00 00 |...Y.....t...x..|
00000014 00 00 00 |...|
00000017
00 00 最后2位都是0000表示这个binlog关闭了,如果是01 00表示这个binlog还在使用中。
[root@test_mysql26 /root/sandboxes/rsandbox_5_6_37/master/data ]# hexdump -C mysql-bin.000001 -s 120 导出rotate event的内容
00000078 f6 e3 fe 59 04 01 00 00 00 2f 00 00 00 a7 00 00 |...Y...../......|
00000088 00 00 00 04 00 00 00 00 00 00 00 6d 79 73 71 6c |...........mysql|
00000098 2d 62 69 6e 2e 30 30 30 30 30 32 ce 7f 95 b8 |-bin.000002....|
000000a7
event_header(19bytes)具体如下:
f6 e3 fe 59 timestamp
04 type_code 其event type是0x04,表示是一个rotate event
01 00 00 00 server_id 表示serverid是 1
2f 00 00 00 event_length 长度为2f,十进制表示就是47bytes
a7 00 00 00 next_position 下一个event起始位置为0xa7,十进制表示就是167
00 00 flags 0000表示这个binlog已经正常关闭了
然后是event data部分,具体如下:
04 00 00 00 00 00 00 00 Fixed data部分,8bytes,记录的是下一个binlog的位置偏移4
6d 79 73 71 6c 2d 62 69 6e 2e 30 30 30 30 30 32 Variable data部分,记录的是下一个binlog的文件名mysql-bin.000002
ce 7f 95 b8 含义未知
TABLE_MAP_EVENT:
Used for row-based binary logging beginning with MySQL 5.1.5.
Fixed data part:
6 bytes. The table ID.
2 bytes. Reserved for future use.
Variable data part:
1 byte. The length of the database name. 库名的长度
Variable-sized. The database name (null-terminated). 库名
1 byte. The length of the table name. 表名的长度
Variable-sized. The table name (null-terminated). 表名
Packed integer. The number of columns in the table. 列的数量
Variable-sized. An array of column types, one byte per column. To find the meanings of these values, look atenum_field_types in the mysql_com.h header file. 每一列的数据类型
Packed integer. The length of the metadata block. metadata block的长度
Variable-sized. The metadata block; see log_event.h for contents and format.
Variable-sized. Bit-field indicating whether each column can be NULL, one bit per column. For this field, the amount of storage required for N columns is INT((N+7)/8) bytes. 该部分记录字段是否允许为空,一位代表一个字段,占用字int((N+7/8))bytes,N为字段数
WRITE_ROWS_EVENT、DELETE_ROW_EVENTS、UPDATE_ROW_EVENTS 都参考如下这种解释:
Used for row-based binary logging beginning with MySQL 5.1.18.
[TODO: following needs verification; it's guesswork]
Fixed data part:
6 bytes. The table ID.
2 bytes. Reserved for future use.
Variable data part:
Packed integer. The number of columns in the table. 列的数量
Variable-sized. Bit-field indicating whether each column is used, one bit per column. For this field, the amount of storage required for N columns is INT((N+7)/8) bytes.
Variable-sized (for UPDATE_ROWS_LOG_EVENT only). Bit-field indicating whether each column is used in theUPDATE_ROWS_LOG_EVENT after-image; one bit per column. For this field, the amount of storage required for N columns is INT((N+7)/8) bytes.
Variable-sized. A sequence of zero or more rows. The end is determined by the size of the event. Each row has the following format:
Variable-sized. Bit-field indicating whether each field in the row is NULL. Only columns that are "used" according to the second field in the variable data part are listed here. If the second field in the variable data part has N one-bits, the amount of storage required for this field is INT((N+7)/8) bytes.
Variable-sized. The row-image, containing values of all table fields. This only lists table fields that are used (according to the second field of the variable data part) and non-NULL (according to the previous field). In other words, the number of values listed here is equal to the number of zero bits in the previous field (not counting padding bits in the last byte).
The format of each value is described in the log_event_print_value() function in log_event.cc.(for UPDATE_ROWS_EVENT only) the previous two fields are repeated, representing a second table row.
For each row, the following is done:
For WRITE_ROWS_LOG_EVENT, the row described by the row-image is inserted.
For DELETE_ROWS_LOG_EVENT, a row matching the given row-image is deleted.
For UPDATE_ROWS_LOG_EVENT, a row matching the first row-image is removed, and the row described by the second row-image is inserted.
HEARTBEAT_LOG_EVENT:
A Heartbeat_log_event is sent by a master to a slave to let the slave know that the master is still alive. Events of this type do not appear in the binary or relay logs. They are generated on a master server by the thread that dumps events and sent straight to the slave without ever being written to the binary log. The slave examines the event contents and then discards it without writing it to the relay log.
主从心跳的heartbeat event是不写到binlog里面的。Sent by a master to a slave to let the slave know that the master is still alive. Not written to log files.