1. 问题描述
2. 问题探究
3. 总结
4. 延伸阅读
1. 问题描述
前几天,QQ群里在讨论一个关于MySQL表统计信息迟迟不更新的问题。
这个问题我复现了,下面是详细过程:
# 创建一个空表 [root@yejr.run]>create table ttxx like t1; # 第一次执行 show table status,看到 Rows = 0,没问题 [root@yejr.run] [test]>show table status like 'ttxx'\G *************************** 1. row *************************** Name: ttxx Engine: InnoDB Version: 10 Row_format: Dynamic Rows: 0 Avg_row_length: 0 Data_length: 16384 Max_data_length: 0 Index_length: 16384 Data_free: 0 Auto_increment: 1 Create_time: 2020-06-04 16:17:54 Update_time: NULL Check_time: NULL Collation: utf8_general_ci Checksum: NULL Create_options: Comment: 1 row in set (0.00 sec) # 写入将近80万条数据 [root@yejr.run]>insert into ttxx select id,name,c1 from t1; Query OK, 799994 rows affected (8.25 sec) Records: 799994 Duplicates: 0 Warnings: 0 # 再次执行 show table status,发现 Rows 值还是 0,并且过了几秒钟后多执行几次,结果依然如此 [root@yejr.run] [test]>show table status like 'ttxx'\G *************************** 1. row *************************** Name: ttxx Engine: InnoDB Version: 10 Row_format: Dynamic Rows: 0 Avg_row_length: 0 Data_length: 16384 Max_data_length: 0 Index_length: 16384 Data_free: 0 Auto_increment: 1 Create_time: 2020-06-04 16:17:54 Update_time: NULL Check_time: NULL Collation: utf8_general_ci Checksum: NULL Create_options: Comment:
简言之,就是执行 SHOW TABLE STATUS
时无法及时查看到该表的最新统计信息。
但与此同时,直接查看 mysql.innodb_table_stats
和 mysql.innodb_index_stats
两个表,却又可以看到该表的统计信息已经更新了:
[root@yejr.run] [test]>select * from mysql.innodb_table_stats where database_name ='test' and table_name ='ttxx';
+---------------+------------+---------------------+--------+----------------------+--------------------------+| database_name | table_name | last_update | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| test | ttxx | 2020-06-04 16:18:24 | 795064 | 2788 | 0 |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
[root@yejr.run] [test]>select * from mysql.innodb_index_stats where database_name ='test' and table_name ='ttxx';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| test | ttxx | PRIMARY | 2020-06-04 16:18:24 | n_diff_pfx01 | 795064 | 20 | aid |
| test | ttxx | PRIMARY | 2020-06-04 16:18:24 | n_leaf_pages | 2764 | NULL | Number of leaf pages in the index |
| test | ttxx | PRIMARY | 2020-06-04 16:18:24 | size | 2788 | NULL | Number of pages in the index |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
尝试跑一个SQL观察执行计划,看起来也是正确的:
[root@yejr.run] [test]>desc select count(*) from ttxx;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | ttxx | NULL | index | NULL | PRIMARY | 4 | NULL | 795064 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+--------+----------+-------------+
此时再执行 SHOW TABLE STATUS
依然无法看到信息更新。
到底肿么回事呢?
2. 问题探究
作为老司机(踩坑大户),首先想到的就是检查官方手册。
MySQL官方手册的描述中,有这么一段内容:
• Rows
The number of rows. Some storage engines, such as MyISAM, store the exact count. For other storage engines, such as InnoDB, this value is an approximation, and may vary from the actual value by as much as 40% to 50%. In such cases, use SELECT COUNT(*) to obtain an accurate count.
The Rows value is NULL for INFORMATION_SCHEMA tables.
For InnoDB tables, the row count is only a rough estimate used in SQL optimization. (This is also true if the InnoDB table is partitioned.)
简言之,就是说MyISAM表的Rows是精确值,但InnoDB表则只是大概值,甚至有可能只是真实值的40% ~ 50% 之间。
另外,这个信息是从 INFORMATION_SCHEMA.TABLES
(下面简称IFS.TABLES)里获取的:
Table information is also available from the INFORMATION_SCHEMA TABLES table. See Section 25.36, “The INFORMATION_SCHEMA TABLES Table”.
那我们再看看文档中关于 IFS.TABLES
的描述吧:
25.36 The INFORMATION_SCHEMA TABLES Table
The TABLES table provides information about tables in databases.
Columns in TABLES that represent table statistics hold cached values. The information_schema_stats_expiry system variable defines the period of time before cached table statistics expire. The default is 86400 seconds (24 hours). If there are no cached statistics or statistics have expired, statistics are retrieved from storage engines when querying table statistics columns. To update cached values at any time for a given table, use ANALYZE TABLE. To always retrieve the latest statistics directly from storage engines, set information_schema_stats_expiry to 0. For more information, see Section 8.2.3, “Optimizing INFORMATION_SCHEMA Queries”.
看到这里,真相基本上呼之欲出了。
IFS.TABLES表中看到的数据是有cache的,默认cache时长是 86400秒(即1天),修改参数 information_schema_stats_expiry 即可调整时长。也就是说,除非cache过期了,或者手动执行 ANALYZE TABLE
更新统计信息,否则不会主动更新。
这个参数(功能)是MySQL 8.0后新增的,所以这个问题在8.0之前的版本不存在。
参数 information_schema_stats_expiry 还影响其 IFS.STATISTICS 表。
</div>