前言
2022/8/4 15:50
暑假学习ing
推荐
【MySQL数据库教程天花板,mysql安装到mysql高级,强!硬!-哔哩哔哩】
第10章 索引优化与查询优化
都有哪些维度可以进行数据库调优?简言之:
- 索引失效、没有充分利用到索引―索引建立
- 关联查询太多JOIN(设计缺陷或不得已的需求)――SQL优化
- 服务器调优及各个参数设置(缓冲、线程数等)――调整my.cnf
- 数据过多――分库分表
关于数据库调优的知识点非常分散。不同的DBMS,不同的公司,不同的职位,不同的项目遇到的问题都不尽相同。这里分为三个章节进行细致讲解。
虽然SQL查询优化的技术有很多,但是大方向上完全可以分成物理查询优化和逻辑查询优化两大块:
- 物理查询优化是通过索引和表连接方式等技术来进行优化,这里重点需要掌握索引的使用。
- 逻辑查询优化就是通过SQL等价变换提升查询效率,直白一点就是说,换一种查询写法执行效率可能更高。
1. 数据准备
学员表
插 50万
条,班级表
插1万
条
步骤1:建表
CREATE TABLE `class` ( `id` INT(11) NOT NULL AUTO_INCREMENT, `className` VARCHAR(30) DEFAULT NULL, `address` VARCHAR(40) DEFAULT NULL, `monitor` INT NULL , PRIMARY KEY (`id`) ) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8; CREATE TABLE `student` ( `id` INT(11) NOT NULL AUTO_INCREMENT, `stuno` INT NOT NULL , `name` VARCHAR(20) DEFAULT NULL, `age` INT(3) DEFAULT NULL, `classId` INT(11) DEFAULT NULL, PRIMARY KEY (`id`) #CONSTRAINT `fk_class_id` FOREIGN KEY (`classId`) REFERENCES `t_class` (`id`) ) ENGINE=INNODB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
步骤2:设置参数
- 命令开启:允许创建函数设置:
#让数据库服务器信任函数的创建,否则会报错 set global log_bin_trust_function_creators=1; # 不加global只是当前窗口有效
步骤3:创建函数
保证每条数据都不同
#随机产生字符串 DELIMITER // CREATE FUNCTION rand_string(n INT) RETURNS VARCHAR(255) BEGIN DECLARE chars_str VARCHAR(100) DEFAULT 'abcdefghijklmnopqrstuvwxyzABCDEFJHIJKLMNOPQRSTUVWXYZ'; DECLARE return_str VARCHAR(255) DEFAULT ''; DECLARE i INT DEFAULT 0; WHILE i < n DO SET return_str =CONCAT(return_str,SUBSTRING(chars_str,FLOOR(1+RAND()*52),1)); SET i = i + 1; END WHILE; RETURN return_str; END // DELIMITER ; #假如要删除 #drop function rand_string;
随机产生班级编号
#用于随机产生多少到多少的编号 DELIMITER // CREATE FUNCTION rand_num (from_num INT ,to_num INT) RETURNS INT(11) BEGIN DECLARE i INT DEFAULT 0; SET i = FLOOR(from_num +RAND()*(to_num - from_num+1)) ; RETURN i; END // DELIMITER ; #假如要删除 #drop function rand_num;
步骤4:创建存储过程
#创建往stu表中插入数据的存储过程 DELIMITER // CREATE PROCEDURE insert_stu( START INT , max_num INT ) BEGIN DECLARE i INT DEFAULT 0; SET autocommit = 0; #设置手动提交事务 REPEAT #循环 SET i = i + 1; #赋值 INSERT INTO student (stuno, name ,age ,classId ) VALUES ((START+i),rand_string(6),rand_num(1,50),rand_num(1,1000)); UNTIL i = max_num END REPEAT; COMMIT; #提交事务 END // DELIMITER ; #假如要删除 #drop PROCEDURE insert_stu;
#执行存储过程,往class表添加随机数据 DELIMITER // CREATE PROCEDURE `insert_class`( max_num INT ) BEGIN DECLARE i INT DEFAULT 0; SET autocommit = 0; REPEAT SET i = i + 1; INSERT INTO class ( classname,address,monitor ) VALUES (rand_string(8),rand_string(10),rand_num(1,100000)); UNTIL i = max_num END REPEAT; COMMIT; END // DELIMITER ; #假如要删除 #drop PROCEDURE insert_class;
步骤5:调用存储过程
class
#执行存储过程,往class表添加1万条数据 CALL insert_class(10000);
stu
#执行存储过程,往stu表添加50万条数据 CALL insert_stu(100000,500000);
SELECT COUNT(*) FROM class; #10000 SELECT COUNT(*) FROM student; #500000
步骤6:删除某表上的索引
创建存储过程
DELIMITER // CREATE PROCEDURE `proc_drop_index`(dbname VARCHAR(200),tablename VARCHAR(200)) BEGIN DECLARE done INT DEFAULT 0; DECLARE ct INT DEFAULT 0; DECLARE _index VARCHAR(200) DEFAULT ''; DECLARE _cur CURSOR FOR SELECT index_name FROM information_schema.STATISTICS WHERE table_schema=dbname AND table_name=tablename AND seq_in_index=1 AND index_name <>'PRIMARY' ; #每个游标必须使用不同的declare continue handler for not found set done=1来控制游标的结束 DECLARE CONTINUE HANDLER FOR NOT FOUND set done=2 ; #若没有数据返回,程序继续,并将变量done设为2 OPEN _cur; FETCH _cur INTO _index; WHILE _index<>'' DO SET @str = CONCAT("drop index " , _index , " on " , tablename ); PREPARE sql_str FROM @str ; EXECUTE sql_str; DEALLOCATE PREPARE sql_str; SET _index=''; FETCH _cur INTO _index; END WHILE; CLOSE _cur; END // DELIMITER ;
执行存储过程
CALL proc_drop_index("dbname","tablename");
2.索引失效案例
MySQL中提高性能
的一个最有效的方式是对数据表设计合理的索引
。索引提供了高效访问数据的方法,并且加快查询的速度,因此索引对查询的速度有着至关重要的影响。
使用索引可以快速地定位表中的某条记录,从而提高数据库查询的速度,提高数据库的性能
如果查询时没有使用索引,查询语句就会扫描表中的所有记录。在数据量大的情况下,这样查询的速度会很慢。
大多数情况下都〈默认)采用B+树来构建索引。只是空间列类型的索引使用R-树,并且MEMORY表还支持hash索引。
其实,用不用索引,最终都是优化器说了算。优化器是基于什么的优化器?基于cost开销 (CostBaseOptimizer ),它不是基于规则(Rule-BasedOptimizer),也不是基于语义。怎么样开销小就怎么来。另外,SQL语句是否使用索引,跟数据库版本、数据量、数据选择度都有关系。
2.1全值匹配
#1)全值匹配我最爱 EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE age=30; /*语句一:没有索引 +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 498858 | 10.00 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ */ EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE age=30 AND classId=4; /*语句二:没有索引 +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 498858 | 1.00 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ */ EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE age=30 AND classId=4 AND NAME = 'abcd'; /*语句三:没有索引 +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 498858 | 0.10 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ */ SELECT SQL_NO_CACHE * FROM student WHERE age=30 AND classId=4 AND NAME = 'abcd'; #Empty set, 1 warning (0.13 sec) EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE age=30; /* +----+-------------+---------+------------+------+----------------------------------------------+---------+---------+-------+-------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+----------------------------------------------+---------+---------+-------+-------+----------+-------+ | 1 | SIMPLE | student | NULL | ref | idx_age,idx_age_classid,idx_age_classid_name | idx_age | 5 | const | 10182 | 100.00 | NULL | +----+-------------+---------+------------+------+----------------------------------------------+---------+---------+-------+-------+----------+-------+ */ #分别为student表创建三个索引: CREATE INDEX idx_age ON student(age);#索引一 CREATE INDEX idx_age_classid ON student(age,classId);#索引二 CREATE INDEX idx_age_classid_name ON student(age,classId,NAME);#索引三 #显示student表上的索引 SHOW INDEX FROM student; #再次执行有索引的语句一: EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE age=30; /*使用了索引一:索引字段为age +----+-------------+---------+------------+------+----------------------------------------------+---------+---------+-------+-------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+----------------------------------------------+---------+---------+-------+-------+----------+-------+ | 1 | SIMPLE | student | NULL | ref | idx_age,idx_age_classid,idx_age_classid_name | idx_age | 5 | const | 10182 | 100.00 | NULL | +----+-------------+---------+------------+------+----------------------------------------------+---------+---------+-------+-------+----------+-------+ */ #再次执行有索引的语句二: EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE age=30 AND classId=4; /*使用了索引二:索引字段为age、classId +----+-------------+---------+------------+------+----------------------------------------------+-----------------+---------+-------------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+----------------------------------------------+-----------------+---------+-------------+------+----------+-------+ | 1 | SIMPLE | student | NULL | ref | idx_age,idx_age_classid,idx_age_classid_name | idx_age_classid | 10 | const,const | 9 | 100.00 | NULL | +----+-------------+---------+------------+------+----------------------------------------------+-----------------+---------+-------------+------+----------+-------+ */ #再次执行有索引的语句三: EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE age=30 AND classId=4 AND NAME = 'abcd'; /*使用了索引三:索引字段为age、classId 、NAME +----+-------------+---------+------------+------+----------------------------------------------+----------------------+---------+-------------------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+----------------------------------------------+----------------------+---------+-------------------+------+----------+-------+ | 1 | SIMPLE | student | NULL | ref | idx_age,idx_age_classid,idx_age_classid_name | idx_age_classid_name | 73 | const,const,const | 1 | 100.00 | NULL | +----+-------------+---------+------------+------+----------------------------------------------+----------------------+---------+-------------------+------+----------+-------+ */ #全值匹配表示索引列和查询条件的字段全部匹配,精度高,key_len长度大 #结论: #当创建多个索引时,查询优化器通常会选取和查询字段匹配度最高的索引 #因为匹配度越高,查询效率越快 #此时除被选中的索引外,其它索引失效
补充:
上面SQL语句中SQL_NO_CACHE的使用保证不存在查询缓存,使各语句的比较不受“是否缓存”的影响,从而达到了“控制变量”的目的
2.2最佳左前缀法则
在MySQL建立联合索引时会遵守最佳左前缀匹配原则,即最左优先,在检索数据时从联合索引的最左边开始匹配
#2)最佳左前缀法则 EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE student.age=30 AND student.name = 'abcd' ; /*使用了索引一:索引字段为age +----+-------------+---------+------------+------+----------------------------------------------+---------+---------+-------+-------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+----------------------------------------------+---------+---------+-------+-------+----------+-------------+ | 1 | SIMPLE | student | NULL | ref | idx_age,idx_age_classid,idx_age_classid_name | idx_age | 5 | const | 10182 | 10.00 | Using where | +----+-------------+---------+------------+------+----------------------------------------------+---------+---------+-------+-------+----------+-------------+ */ EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE student.classid=1 AND student.name = 'abcd'; /*没有使用索引:因为没有classid索引、也没有(classid、name)索引 没有使用(age,classId,NAME)索引的原因:不符合最佳左前缀法则 +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 498858 | 1.00 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ */ EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE classid=4 AND student.age=30 AND student.name = 'abcd'; /*使用了索引三(age,classId,NAME):索引字段为age、classId 、NAME +----+-------------+---------+------------+------+----------------------------------------------+----------------------+---------+-------------------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+----------------------------------------------+----------------------+---------+-------------------+------+----------+-------+ | 1 | SIMPLE | student | NULL | ref | idx_age,idx_age_classid,idx_age_classid_name | idx_age_classid_name | 73 | const,const,const | 1 | 100.00 | NULL | +----+-------------+---------+------------+------+----------------------------------------------+----------------------+---------+-------------------+------+----------+-------+ */
结论:
MySQL可以为多个字段创建索引,一个索引可以包括16个字段。对于多列索引,过滤条件要使用索引必须按照索引建立时的顺序,依次满足,一旦跳过某个字段,索引后面的字段都无法被使用。如果查询条件中没有使用这些字段中第1个字段时,多列(或联合)索引不会被使用。
拓展: Alibaba 《Java开发手册》
索引文件具有B-Tree的最左前缀匹配特性,如果左边的值未确定,那么无法使用此索引
2.3主键插入顺序
对于一个使用InnoDB存储引擎的表来说,在没有显式的创建索引时,表中的数据实际上都是存储在聚簇索引的叶子节点的。而记录又是存储在数据页中的,数据页和记录又是按照记录主键值从小到大的顺序进行排序,所以如果插入的记录的主键值是依次增大的话,那每插满一个数据页就换到下一个数据页继续插,而如果插入的主键值忽大忽小的话(一般不让这种情况发生),就比较麻烦了,假设某个数据页存储的记录已经满了,它存储的主键值在1~100之间:
如果此时再插入一条主键值为 9
的记录,那它插入的位置就如下图:
可这个数据页已经满了,再插进来咋办呢?
需要把当前 页面分裂 成两个页面,把本页中的一些记录移动到新创建的这个页中。
页面分裂和记录移位意味着什么?
意味着: 性能损耗 !所以如果想尽量避免这样无谓的性能损耗,最好让插入的记录的 主键值依次递增 ,这样就不会发生这样的性能损耗了。
所以建议:让主键具有 AUTO_INCREMENT ,让存储引擎自己为表生成主键,而不是手动插入 ,比如: person_info 表:
CREATE TABLE person_info( id INT UNSIGNED NOT NULL AUTO_INCREMENT, name VARCHAR(100) NOT NULL, birthday DATE NOT NULL, phone_number CHAR(11) NOT NULL, country varchar(100) NOT NULL, PRIMARY KEY (id), KEY idx_name_birthday_phone_number (name(10), birthday, phone_number) );
我们自定义的主键列 id 拥有AUTO_INCREMENT 属性,在插入记录时存储引擎会自动为我们填入自增的主键值。这样的主键占用空间小,顺序写入,减少页分裂。
# 3)主键插入顺序 #略
2.4 计算、函数、类型转换(自动或手动)导致索引失效
#4)计算、函数、类型转换(自动或手动)导致索引失效 CREATE INDEX idx_name ON student(NAME);#创建索引(NAME) #此语句比下一条要好!(能够使用上索引) EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE student.name LIKE 'abc%'; /* +----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+ | 1 | SIMPLE | student | NULL | range | idx_name | idx_name | 63 | NULL | 36 | 100.00 | Using index condition | +----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+ */ #LEFT(student.name,3) = 'abc'; 中left函数的使用导致索引失效 EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE LEFT(student.name,3) = 'abc'; /* +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 498858 | 100.00 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ */ #再举例 CREATE INDEX idx_sno ON student(stuno);#创建索引(stuno) EXPLAIN SELECT SQL_NO_CACHE id, stuno, NAME FROM student WHERE stuno+1 = 900001; /*使用“stuno+1 = 900001”算术运算导致索引失效 +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 498858 | 100.00 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ */ EXPLAIN SELECT SQL_NO_CACHE id, stuno, NAME FROM student WHERE stuno = 900000; /*使用了索引: +----+-------------+---------+------------+------+---------------+---------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+---------+---------+-------+------+----------+-------+ | 1 | SIMPLE | student | NULL | ref | idx_sno | idx_sno | 4 | const | 1 | 100.00 | NULL | +----+-------------+---------+------------+------+---------------+---------+---------+-------+------+----------+-------+ */ #再举例 EXPLAIN SELECT id, stuno, NAME FROM student WHERE SUBSTRING(NAME, 1,3)='abc'; /*索引失效 +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 498858 | 100.00 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ */
2.5 类型转换导致索引失效
#5)类型转换导致索引失效 EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE NAME = 123; /*没有使用索引:name是字符串类型,和int匹配要类型转换 +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | student | NULL | ALL | idx_name | NULL | NULL | NULL | 498858 | 10.00 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ */ EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE NAME = '123'; /*使用了索引: +----+-------------+---------+------------+------+---------------+----------+---------+-------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+----------+---------+-------+------+----------+-------+ | 1 | SIMPLE | student | NULL | ref | idx_name | idx_name | 63 | const | 1 | 100.00 | NULL | +----+-------------+---------+------------+------+---------------+----------+---------+-------+------+----------+-------+ */
2.6 范围条件右边的列索引失效
范围条件:含(<) (<=) (>) (>=)和between等的条件
#6)范围条件右边的列索引失效 SHOW INDEX FROM student; CALL proc_drop_index('atguigudb2','student');#清空所有student表的索引 CREATE INDEX idx_age_classId_name ON student(age,classId,NAME);#创建联合索引 idx_age_classId_name EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE student.age=30 AND student.classId>20 AND student.name = 'abc' ; /*Using index condition表示:有些搜索条件中虽然出现了索引列,但却不能使用到索引 #使用了索引 idx_age_classId_name但是只用了联合索引的前两个字段 # 结合`age` INT(3)占5 、 `classId` INT(11)占5 以及key_len=10可知只使用了前两个字段 +----+-------------+---------+------------+-------+----------------------+----------------------+---------+------+-------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+----------------------+----------------------+---------+------+-------+----------+-----------------------+ | 1 | SIMPLE | student | NULL | range | idx_age_classId_name | idx_age_classId_name | 10 | NULL | 18456 | 10.00 | Using index condition | +----+-------------+---------+------------+-------+----------------------+----------------------+---------+------+-------+----------+-----------------------+ */ EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE student.age=30 AND student.name = 'abc' AND student.classId>20; /*对于优化器来说AND连接的这几个条件可以任意颠倒,故此SQL语句和上一句执行效果一样 联合索引classId在中间 +----+-------------+---------+------------+-------+----------------------+----------------------+---------+------+-------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+----------------------+----------------------+---------+------+-------+----------+-----------------------+ | 1 | SIMPLE | student | NULL | range | idx_age_classId_name | idx_age_classId_name | 10 | NULL | 18456 | 10.00 | Using index condition | +----+-------------+---------+------------+-------+----------------------+----------------------+---------+------+-------+----------+-----------------------+ */ #建一个新的索引 CREATE INDEX idx_age_name_cid ON student(age,NAME,classId); EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE student.age=30 AND student.name = 'abc' AND student.classId>20; /*在新索引下三个字段都用上了 +----+-------------+---------+------------+-------+---------------------------------------+------------------+---------+------+------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+---------------------------------------+------------------+---------+------+------+----------+-----------------------+ | 1 | SIMPLE | student | NULL | range | idx_age_classId_name,idx_age_name_cid | idx_age_name_cid | 73 | NULL | 1 | 100.00 | Using index condition | +----+-------------+---------+------------+-------+---------------------------------------+------------------+---------+------+------+----------+-----------------------+ */ #补充说明: #对于优化器来说AND连接的字段先写哪个后写哪个无所谓 #具体使用了哪几个字段只和索引中定义字段的位置以及哪个字段使用了范围查询有关 #“范围条件右边的列”中的右-->是左是右要看索引中定义字段的相对位置,而不是字段在where中的位置
启发:
应用开发中范围查询,例如:金额查询,日期查询往往都是范围查询。应将查询条件放置where语句最后,建索引时也放在最后
(创建的联合索引中,务必把范围涉及到的字段写在最后)
2.7 不等于(!= 或者<>)索引失效
#7)不等于(!= 或者<>)索引失效 #不等于时用不上B+树,只能一个一个查找 CREATE INDEX idx_name ON student(NAME); EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE student.name <> 'abc' ; #或 EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE student.name != 'abc' ; /*索引失效 +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | student | NULL | ALL | idx_name | NULL | NULL | NULL | 498858 | 50.15 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ */
2.8 is null可以使用索引,is not null无法使用索引
#8)is null可以使用索引,is not null无法使用索引 #is null可以使用索引 EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE age IS NULL; #is not null无法使用索引 EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE age IS NOT NULL;
结论:最好在设计数据表的时候就将字段设置为 NOT NULL约束
,比如可以将INT类型的字段,默认值设置为0。将字符类型的默认值设置为空字符串’’’’。
拓展:同理,在查询中使用not like
也无法使用索引,导致全表扫描
2.9 like以通配符%开头索引失效
在使用LIKE关键字进行查询的查询语句中,如果匹配字符串的第一个字符为"%”,索引就不会起作用。只有"%"不在第一个位置,索引才会起作用
#9)like以通配符%开头索引失效 EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE NAME LIKE 'ab%'; /*使用了索引 +----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+ | 1 | SIMPLE | student | NULL | range | idx_name | idx_name | 63 | NULL | 711 | 100.00 | Using index condition | +----+-------------+---------+------------+-------+---------------+----------+---------+------+------+----------+-----------------------+ */ EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE NAME LIKE '%ab%'; /* +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | student | NULL | ALL | NULL | NULL | NULL | NULL | 498858 | 11.11 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ */
拓展:Alibaba《Java开发手册》
【强制】页面搜索严禁左模糊或者全模糊,如果需要请走搜索引擎来解决。
2.10 OR 前后存在非索引的列,索引失效
在WHERE子句中,如果在OR前的条件列进行了索引,而在OR后的条件列没有进行索引,那么索引会失效。也就是说,OR前后的两个条件中的列都是索引时,查询中才使用索引
因为OR的含义就是两个只要满足一个即可,因此只有一个条件列进行了索引是没有意义的,只要有条件列没有进行索引,就会进行全表扫描,因此索引的条件列也会失效
#10)OR 前后存在非索引的列,索引失效 SHOW INDEX FROM student; CALL proc_drop_index('atguigudb2','student'); CREATE INDEX idx_age ON student(age); EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE age = 10 OR classid = 100; /*没有使用索引 +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ | 1 | SIMPLE | student | NULL | ALL | idx_age | NULL | NULL | NULL | 498858 | 11.88 | Using where | +----+-------------+---------+------------+------+---------------+------+---------+------+--------+----------+-------------+ */ #为前后两个索引都创建索引,则OR连接他们时就可以使用索引 CREATE INDEX idx_cid ON student(classid); EXPLAIN SELECT SQL_NO_CACHE * FROM student WHERE age = 10 OR classid = 100; /*因为age字段和classid字段上都有索引,所以查询中使用了索引 +----+-------------+---------+------------+-------------+-----------------+-----------------+---------+------+-------+----------+-------------------------------------------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+---------+------------+-------------+-----------------+-----------------+---------+------+-------+----------+-------------------------------------------+ | 1 | SIMPLE | student | NULL | index_merge | idx_age,idx_cid | idx_age,idx_cid | 5,5 | NULL | 10612 | 100.00 | Using union(idx_age,idx_cid); Using where | +----+-------------+---------+------------+-------------+-----------------+-----------------+---------+------+-------+----------+-------------------------------------------+ 能看到这里使用到了index_merge,简单来说index_merge就是对age和classid分别进行了扫描,然后将这两个结果集进行了合并。这样做的好处就是避免了全表扫描 */
2.11 不同的字符集进行比较前需要进行 转换、会造成索引失效
统一使用utf8mb4( 5.5.3版本以上支持)兼容性更好,统一字符集可以避免由于字符集转换产生的乱码。不同的字符集
进行比较前需要进行 转换
会造成索引失效
建议:数据库和表的字符集统一使用utf8mb4
2.12练习及一般性建议练习:
假设: index(a,b.c)
一般性建议:
对于单列索引,尽量选择针对当前query过滤性更好的索引
在选择组合索引的时候,当前query中过滤性最好的字段在索引字段顺序中,位置越靠前越好
在选择组合索引的时候,尽量选择能够包含当前query中的where了句中更多字段的索引
在选择组合索引的时候,如果某个字段可能出现范围查询时,尽量把这个字段放在索引次序的最后面
总之,书写SQL语句时,尽量避免造成索引失效的情况
3. 关联查询优化(多表查询)
3.1 数据准备
#分类 CREATE TABLE IF NOT EXISTS `type` ( `id` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, `card` INT(10) UNSIGNED NOT NULL, PRIMARY KEY (`id`) ); #图书 CREATE TABLE IF NOT EXISTS `book` ( `bookid` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT, `card` INT(10) UNSIGNED NOT NULL, PRIMARY KEY (`bookid`) ); #向分类表中添加20条记录 INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO TYPE(card) VALUES(FLOOR(1 + (RAND() * 20))); #向图书表中添加20条记录 INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20))); INSERT INTO book(card) VALUES(FLOOR(1 + (RAND() * 20)));
3.2 采用左外连接
下面开始 EXPLAIN 分析
# 情况1:左外连接 #连接的时候就和“嵌套循环”一样 #每次从驱动表里选取一条记录去被驱动表里整个遍历一遍 #将符合连接条件的放到结果集中 #驱动表和被驱动表-->EXPLAIN执行结果的记录中,上面的是驱动表,下面的是被驱动表 EXPLAIN SELECT SQL_NO_CACHE * FROM `type` LEFT JOIN book ON type.card = book.card;
结论:type 有All
添加索引优化
#添加索引优化 #给被驱动表加了索引可以避免全表扫描 ALTER TABLE book ADD INDEX Y ( card); #【被驱动表】,可以避免全表扫描 EXPLAIN SELECT SQL_NO_CACHE * FROM `type` LEFT JOIN book ON type.card = book.card;
可以看到第二行的 type 变为了 ref,rows 也变成了1,优化比较明显。这是由左连接特性决定的。LEFT JOIN条件用于确定如何从右表搜索行,左边一定都有,所以右边是我们的关键点,一定需要建立索引 。
#给驱动表加了索引也要全表扫描 ALTER TABLE `type` ADD INDEX X (card); #【驱动表】,无法避免全表扫描 EXPLAIN SELECT SQL_NO_CACHE * FROM `type` LEFT JOIN book ON type.card = book.card;
接着:
DROP INDEX Y ON book; EXPLAIN SELECT SQL_NO_CACHE * FROM `type` LEFT JOIN book ON type.card = book.card;