今天遇到一个非常神奇的sql执行计划时好时坏,我们一起来领略一下吧
废话不多说,直接进入实战
环境
* version:MySQL5.6.27 社区版
* 表结构
CREATE TABLE `xx` (
`TagId` int(11) NOT NULL AUTO_INCREMENT COMMENT '',
`TagType` int(11) DEFAULT NULL COMMENT '',
`SubType` int(11) DEFAULT NULL COMMENT '',
`CommId` int(11) NOT NULL DEFAULT '0' COMMENT '',
`TagFlag` int(11) NOT NULL DEFAULT '0' COMMENT '',
`TagName` varchar(255) DEFAULT NULL COMMENT '',
`OrderId` int(11) DEFAULT '0' COMMENT '',
`Unum` int(10) NOT NULL DEFAULT '0' COMMENT '',
`IsBest` int(11) NOT NULL DEFAULT '0' COMMENT '',
`BrokerId` int(11) NOT NULL DEFAULT '0' COMMENT '',
`AddDate` int(11) DEFAULT NULL COMMENT '',
`UpdateDate` int(11) DEFAULT NULL COMMENT '',
`updatetime` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`tmpnum` int(10) DEFAULT '0' COMMENT '',
`cityid` int(11) DEFAULT '0' COMMENT '',
PRIMARY KEY (`TagId`),
KEY `idx_4` (`IsBest`,`TagFlag`,`CommId`),
KEY `idxnew` (`UpdateDate`),
KEY `idx_lc_1` (`TagName`,`TagType`,`TagId`),
KEY `idx_lc_2` (`CommId`,`TagName`,`TagType`),
KEY `idx_tagName_brokerId_cityId` (`TagName`,`BrokerId`,`cityid`),
KEY `idx_lc_3` (`SubType`,`TagType`,`cityid`),
) ENGINE=InnoDB AUTO_INCREMENT=20628140 DEFAULT CHARSET=utf8
DB症状
1. slow query 非常多
2. thread_running 非常多
3. cpu 90%
4. too many connection
多症齐发
定位问题
很明显就是去寻找slow query,毕竟slow是我衡量DB性能重要标准。
然后发现99%都是类似这样的语句:
# Time: 170304 10:32:07
# User@Host[] @ [] Id: 26019853
# Query_time: 0.251174 Lock_time: 0.000078 Rows_sent: 1 Rows_examined: 470135
SET timestamp=1488594727;
select `TagId`,`TagType`,`SubType`,`CommId`,`TagFlag`,`TagName`,`OrderId`,`Unum`,`IsBest`,`BrokerId`,`AddDate`,`UpdateDate`,`updatetime`,`tmpnum`,`cityid` from `xx` where `TagType` = '1' and `TagName` = '**高' order by `TagId` ASC limit 1 ;
分析问题
- step1:查看执行计划
explain select `TagId`,`TagType`,`SubType`,`CommId`,`TagFlag`,`TagName`,`OrderId`,`Unum`,`IsBest`,`BrokerId`,`AddDate`,`UpdateDate`,`updatetime`,`tmpnum`,`cityid` from `xx` where `TagType` = '1' and `TagName` ='**高' order by `TagId` limit 1;
+----+-------------+-----------------+-------+--------------------------------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+-------+--------------------------------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | xx | index | idx_lc_1,idx_tagName_brokerId_cityId | PRIMARY | 4 | NULL | 175 | Using where |
+----+-------------+-----------------+-------+--------------------------------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
这条语句执行时间是: 0.99s
奇怪,从表结构上看,应该会使用idx_lc_1才对,为什么执行计划是错的呢?
- step2:第二反应
会不会是TagType是int类型,但是sql语句确实字符串呢?隐士类型转换的导致的执行计划出错之前也是碰到过的。
试试吧,
explain select `TagId`,`TagType`,`SubType`,`CommId`,`TagFlag`,`TagName`,`OrderId`,`Unum`,`IsBest`,`BrokerId`,`AddDate`,`UpdateDate`,`updatetime`,`tmpnum`,`cityid` from `xx` where `TagType` = 1 and `TagName` ='**高' order by `TagId` limit 1;
+----+-------------+-----------------+-------+--------------------------------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+-------+--------------------------------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | xx | index | idx_lc_1,idx_tagName_brokerId_cityId | PRIMARY | 4 | NULL | 175 | Using where |
+----+-------------+-----------------+-------+--------------------------------------+---------+---------+------+------+-------------+
1 row in set (0.00 sec)
这条语句执行时间是: 0.89s
还是非常缓慢,看来不是这个原因。
- step3:会不会是数据的问题呢?
因为从slow的分布看,基本上都是`TagName` ='**高' 的slow,其他的值也没发现,所以开始怀疑value,调整下看看呢
explain select `TagId`,`TagType`,`SubType`,`CommId`,`TagFlag`,`TagName`,`OrderId`,`Unum`,`IsBest`,`BrokerId`,`AddDate`,`UpdateDate`,`updatetime`,`tmpnum`,`cityid` from `xx` where `TagType` = 1 and `TagName` ='%%高' order by `TagId` limit 1;
+----+-------------+-----------------+------+--------------------------------------+----------+---------+-------------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+------+--------------------------------------+----------+---------+-------------+------+------------------------------------+
| 1 | SIMPLE | xx | ref | idx_lc_1,idx_tagName_brokerId_cityId | idx_lc_1 | 773 | const,const | 3 | Using index condition; Using where |
+----+-------------+-----------------+------+--------------------------------------+----------+---------+-------------+------+------------------------------------+
这条语句执行时间:0.00s
哇塞,0s就解决战斗,但是这又是为什么呢?
再试一下:将‘**’高,换成‘*高’
explain select `TagId`,`TagType`,`SubType`,`CommId`,`TagFlag`,`TagName`,`OrderId`,`Unum`,`IsBest`,`BrokerId`,`AddDate`,`UpdateDate`,`updatetime`,`tmpnum`,`cityid` from `xx` where `TagType` = 1 and `TagName` ='*高' order by `TagId` limit 1;
+----+-------------+-----------------+------+--------------------------------------+----------+---------+-------------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+------+--------------------------------------+----------+---------+-------------+------+------------------------------------+
| 1 | SIMPLE | xx | ref | idx_lc_1,idx_tagName_brokerId_cityId | idx_lc_1 | 773 | const,const | 3 | Using index condition; Using where |
+----+-------------+-----------------+------+--------------------------------------+----------+---------+-------------+------+------------------------------------+
执行计划也正确,执行时间也非常快。
然后笃定的认为问题找到了,竟然是 ‘**’导致的。
当我自己给自己sleep 10s 之后,开始思考,这是为什么呢? 等值匹配跟*有关系吗?
- step4: 再次调整语句
* 去掉limit呢? 因为limit是执行计划的杀手,这个我想大部分DBA知道的吧。。。
explain select `TagId`,`TagType`,`SubType`,`CommId`,`TagFlag`,`TagName`,`OrderId`,`Unum`,`IsBest`,`BrokerId`,`AddDate`,`UpdateDate`,`updatetime`,`tmpnum`,`cityid` from `xx` where `TagType` = 1 and `TagName` ='*高' order by `TagId` ;
+----+-------------+-----------------+------+--------------------------------------+-----------------------------+---------+-------+-------+----------------------------------
------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
|
+----+-------------+-----------------+------+--------------------------------------+-----------------------------+---------+-------+-------+----------------------------------
------------------+
| 1 | SIMPLE | xx | ref | idx_lc_1,idx_tagName_brokerId_cityId | idx_tagName_brokerId_cityId | 768 | const | 13854 | Using index condition; Using wher
e; Using filesort |
+----+-------------+-----------------+------+--------------------------------------+-----------------------------+---------+-------+-------+----------------------------------
------------------+
惊奇的发现,执行计划再次发生了改变。。。。
idx_tagName_brokerId_cityId 为什么又冒出来了呢?
那我们再回头看看表结构:
PRIMARY KEY (`TagId`),
KEY `idx_4` (`IsBest`,`TagFlag`,`CommId`),
KEY `idxnew` (`UpdateDate`),
KEY `idx_lc_1` (`TagName`,`TagType`,`TagId`),
KEY `idx_lc_2` (`CommId`,`TagName`,`TagType`),
KEY `idx_tagName_brokerId_cityId` (`TagName`,`BrokerId`,`cityid`),
KEY `idx_lc_3` (`SubType`,`TagType`,`cityid`)
去掉干扰项后:
PRIMARY KEY (`TagId`),
`idx_lc_1` (`TagName`,`TagType`,`TagId`),
`idx_tagName_brokerId_cityId` (`TagName`,`BrokerId`,`cityid`),
执行计划竟然没有选择idx_lc_1,而是idx_tagName_brokerId_cityId,那么这个肯定是干扰索引。
所以,就更加清晰的定位到idx_tagName_brokerId_cityId索引的问题,然后开始调整这个索引,主要是第一个字段TagName的干扰,选择性的问题。
将: KEY `idx_tagName_brokerId_cityId` (`TagName`,`BrokerId`,`cityid`) => KEY `idx_tagName_brokerId_cityId` (`BrokerId`,`TagName`,`cityid`)
- step 5: 再次观察执行计划
explain select `TagId`,`TagType`,`SubType`,`CommId`,`TagFlag`,`TagName`,`OrderId`,`Unum`,`IsBest`,`BrokerId`,`AddDate`,`UpdateDate`,`updatetime`,`tmpnum`,`cityid` from `xx` where `TagType` = 1 and `TagName` ='**高' order by `TagId` limit 1;
+----+-------------+-----------------+------+--------------------------------------+----------+---------+-------------+------+------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------------+------+--------------------------------------+----------+---------+-------------+------+------------------------------------+
| 1 | SIMPLE | xx | ref | idx_lc_1,idx_tagName_brokerId_cityId | idx_lc_1 | 773 | const,const | 3 | Using index condition; Using where |
+----+-------------+-----------------+------+--------------------------------------+----------+---------+-------------+------+------------------------------------+
sql执行时间:0.00s
总结
- 至此,问题已经解决,第一个前缀索引是如此的重要。
- 索引调优是门艺术
展望
- 以后如何调整和优化类似的索引执行计划呢?
原则: 高索引基数的filed,必须放前面。
- 希望MySQL的优化器以后越来越强大