1 演示环境的搭建
1.1 MySQL数据库的安装
在LInux上安装MySQL有多种方式可以选择,最简单的是使用rpm包安装,这种方式的缺点是提供的安装选项不多,不能选择自己的数据目录、安装目录等。最灵活的安装方式是源码编译安装,可以根据需要选择编译的功能模块,但是安装过程比较复杂。在灵活性和安装复杂度折衷一点的选择是二进制包安装,可以根据需要选择安装目录和数据目录,安装过程也不算太复杂,建议选择这种方式进行MySQL安装。
从MySQL官网下载数据库二机制安装包,这里选择的MySQL5.7的最新版本,下载后解压到/usr/local目录, tar的-C选项指定解压目录。
[root@iZ2ze0t8khaprrpfvmevjiZ ~]# tar -xvf mysql-5.7.34-linux-glibc2.12-x86_64.tar.gz -C /usr/local
切换到/usr/local目录,创建一个软链接mysql连接至解压后的安装软件,创建软链接的目的是管理和使用的方便,如果在一台服务器上安装了多个版本的MySQL,只需要将mysql软链接删除,链接至不同版本的安装软件即可实现版本的切换。
[root@iZ2ze0t8khaprrpfvmevjiZ ~]# cd /usr/local
[root@iZ2ze0t8khaprrpfvmevjiZ local]# ln -s mysql-5.7.34-linux-glibc2.12-x86_64 mysql
[root@iZ2ze0t8khaprrpfvmevjiZ local]# ls -l
total 0
drwxr-xr-x 7 root root 160 Jul 28 16:33 aegis
drwxr-xr-x. 2 root root 210 Jul 20 17:27 bin
drwxr-xr-x. 2 root root 6 Jun 22 2021 etc
lrwxrwxrwx 1 root root 35 Aug 9 13:55 mysql -> mysql-5.7.34-linux-glibc2.12-x86_64
drwxr-xr-x 9 root root 129 Aug 9 13:53 mysql-5.7.34-linux-glibc2.12-x86_64
将MySQL执行文件所在的目录加入到PATH环境变量中,这样在运行MySQL相关命令时不用切换到mysql 命令所在的目录。
[root@iZ2ze0t8khaprrpfvmevjiZ ~]# echo 'export PATH=$PATH:/usr/local/mysql/bin' >> ~/.bash_profile
运行一下用户默认登录脚本,使新设置的PATH变量生效
[root@iZ2ze0t8khaprrpfvmevjiZ ~]# source .bash_profile
检查当前环境变量,MySQL执行文件所在目录已加入到PATH环境变量之中。
[root@iZ2ze0t8khaprrpfvmevjiZ ~]# env|grep -i path
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/0/busPATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin:/root/bin:/usr/local/mysql/bin
创建mysql用户组
[root@iZ2ze0t8khaprrpfvmevjiZ ~]# groupadd mysql
创建mysql用户,用户的shell设置为/bin/false,禁止mysql用户登录。
[root@iZ2ze0t8khaprrpfvmevjiZ ~]# useradd -r -g mysql -s /bin/false mysql
创建mysql数据目录,改变目录的属主为mysql:mysql
[root@iZ2ze0t8khaprrpfvmevjiZ mysql]# mkdir /mysqldata
[root@iZ2ze0t8khaprrpfvmevjiZ mysql]# chown mysql:mysql /mysqldata
1.2 MySQL数据库初始化及启动
MySQL的安装环境准备好之后就可以对数据库进行初始化了,初始化时要指定数据库的操作系统用户,因为我这里使用的不是默认目录,所以也要指定数据目录。
[root@iZ2ze0t8khaprrpfvmevjiZ bin]# mysqld --initialize --user=mysql --datadir=/mysqldata
2022-08-09T06:07:51.802610Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2022-08-09T06:07:52.836022Z 0 [Warning] InnoDB: New log files created, LSN=45790
2022-08-09T06:07:52.936636Z 0 [Warning] InnoDB: Creating foreign key constraint system tables.
2022-08-09T06:07:53.014709Z 0 [Warning] No existing UUID has been found, so we assume that this is the first time that this server has been started. Generating a new UUID: 9aaac001-17a9-11ed-b423-00163e2eafc6.
2022-08-09T06:07:53.017913Z 0 [Warning] Gtid table is not ready to be used. Table 'mysql.gtid_executed' cannot be opened.
2022-08-09T06:07:54.308559Z 0 [Warning] CA certificate ca.pem is self signed.
2022-08-09T06:07:54.512118Z 1 [Note] A temporary password is generated for root@localhost: R_!Y2zfqq5kC
初始化成功后会创建一个临时的root账号密码,记下此密码后启动数据库,用mysqld_safe命令,同样需要指定操作系统用户和数据目录,放入后台运行。
[root@iZ2ze0t8khaprrpfvmevjiZ bin]# mysqld_safe --user=mysql --datadir=/mysqldata &
登录数据库,输入上面显示的临时root密码
[root@iZ2ze0t8khaprrpfvmevjiZ bin]# mysql -uroot -p
Enter password:
在运行命令时提示在执行命令之前必须重置密码
mysql> show databases;
ERROR 1820 (HY000): You must reset your password using ALTER USER statement before executing this statement.
使用下面的命令重置root密码,退出数据库使用重置后的密码登录即可执行查询命令。
mysql> ALTER USER USER() IDENTIFIED BY "******";
Query OK, 0 rows affected (0.00 sec)
mysql> select version();
+-----------+
| version() |
+-----------+
| 5.7.34 |
+-----------+
1 row in set (0.00 sec)
1.3 导入sakila示例数据库
从MySQL官网上下载sakila示例数据库的创建脚本,解压后的脚本文件有两个,sakila-schema.sql创建sakila数据库及表和索引,sakila-data.sql数据库插入数据。这里的实验环境时ECS,最初从ECS控制台上用文件上传,试了之后发现上传文件最大限制为16k,改用psftp上传,psftp时putty的上传工具,只要服务端打开了ssh服务就可以使用,十分方便。
上传的目录是/root/,切换到这个目录,登录MySQL数据库运行这两个脚本即可。
mysql> source sakila-schema.sql
mysql> source sakila-data.sql
1.4 主要示例表介绍
本文使用sakila中的比较简答的city表来演示,这个表数据量、列数比较少,也有外键,主键,可以说是麻雀虽小,五脏俱全。下面是表的定义
mysql> show create table city\G;
*************************** 1. row ***************************
Table: city
Create Table: CREATE TABLE `city` (
`city_id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`city` varchar(50) NOT NULL,
`country_id` smallint(5) unsigned NOT NULL,
`last_update` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`city_id`),
KEY `idx_fk_country_id` (`country_id`),
CONSTRAINT `fk_city_country` FOREIGN KEY (`country_id`) REFERENCES `country` (`country_id`) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=601 DEFAULT CHARSET=utf8mb4
1 row in set (0.00 sec)
表上定义了两个索引
mysql> show index from city;
+-------+------------+-------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| city | 0 | PRIMARY | 1 | city_id | A | 600 | NULL | NULL | | BTREE | | |
| city | 1 | idx_fk_country_id | 1 | country_id | A | 600 | NULL | NULL | | BTREE | | |
+-------+------------+-------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)
2 MySQL数据库执行计划
先从一个简单的例子看一下MySQL的执行计划,查看的命令是explain,使用explain查看sql语句的执行计划并不执行sql语句,只是解释输出语句的执行计划,对数据库不会造成什么影响,可以放心使用。
mysql> explain select city from city;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | SIMPLE | city | NULL | ALL | NULL | NULL | NULL | NULL | 600 | 100.00 | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
上面的语句是MySQL数据库中最简单的语句,从上面可以看到,执行计划的输出有好多列,这些列的信息对推测判断sql语句的性能都有或大或小的帮助。有些列的信息非常重要,比如table,type,possible_keys,key,extra这些列,通过这些列的信息有时可以直接判断出sql语句的问题处在哪里,应该怎样对其优化。比如对上面的语句,从type的值ALL可以判断出语句执行了全表扫描,而扫描的对象是table的值city,语句的执行过程中没有可以使用的索引。
3 MySQL执行计划关键信息解释
3.1 id
id是执行计划的第一列,即操作的序列号,如果没有子查询或者是union,就只有一个查询,各行的id值是1,否则,内部的sql语句就会按序号标识。执行计划输出的每一行是关于一个表的操作,执行的顺序是从上到下依次执行。执行计划的id号并不总是从小到大排列,即id小的操作并不总是在id大的操作之前执行,下面是一个例子:
mysql> explain select a.customer_id, a.address_id, a.first_name, a.last_name
from customer a inner join address b on a.address_id=b.address_id
inner join city c on b.city_id =c.city_id
where c.country_id in (select country_id from country where country<>"Yemen"
union
select country_id from country where country<>"Vietnam")
and a.store_id not in (select store_id from store where manager_staff_id=2);
-+---------------------+------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+------------+------------+--------+----------------------------+--------------------+---------+---------------------+------+----------+-----------------+
| 1 | PRIMARY | a | NULL | ALL | idx_fk_address_id | NULL | NULL | NULL | 599 | 100.00 | Using where |
| 1 | PRIMARY | b | NULL | eq_ref | PRIMARY,idx_fk_city_id | PRIMARY | 2 | sakila.a.address_id | 1 | 100.00 | NULL |
| 1 | PRIMARY | c | NULL | eq_ref | PRIMARY | PRIMARY | 2 | sakila.b.city_id | 1 | 100.00 | Using where |
| 4 | SUBQUERY | store | NULL | const | PRIMARY,idx_unique_manager | idx_unique_manager | 1 | const | 1 | 100.00 | Using index |
| 2 | DEPENDENT SUBQUERY | country | NULL | eq_ref | PRIMARY | PRIMARY | 2 | func | 1 | 90.00 | Using where |
| 3 | DEPENDENT UNION | country | NULL | eq_ref | PRIMARY | PRIMARY | 2 | func | 1 | 90.00 | Using where |
| NULL | UNION RESULT | | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
+----+--------------------+------------+------------+--------+----------------------------+--------------------+---------+---------------------+------+----------+-----------------+
上面的语句既有子查询也有联合(union),执行计划输出中各行的id并不都是1,对于store子查询的id是4,却在对于country的子查询(id为2)和对于country的union(id为3)之前执行。把上面的语句的where条件的顺序变一下,再看一下执行计划
mysql> explain select a.customer_id, a.address_id, a.first_name, a.last_name
from customer a inner join address b on a.address_id=b.address_id
inner join city c on b.city_id =c.city_id
where
a.store_id not in (select store_id from store where manager_staff_id=2) and
c.country_id in (select country_id from country where country<>"Yemen"
union
select country_id from country where country<>"Vietnam");
-+---------------------+------+----------+-----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------------+------------+------------+--------+----------------------------+--------------------+---------+---------------------+------+----------+-----------------+
| 1 | PRIMARY | a | NULL | ALL | idx_fk_address_id | NULL | NULL | NULL | 599 | 100.00 | Using where |
| 1 | PRIMARY | b | NULL | eq_ref | PRIMARY,idx_fk_city_id | PRIMARY | 2 | sakila.a.address_id | 1 | 100.00 | NULL |
| 1 | PRIMARY | c | NULL | eq_ref | PRIMARY | PRIMARY | 2 | sakila.b.city_id | 1 | 100.00 | Using where |
| 3 | DEPENDENT SUBQUERY | country | NULL | eq_ref | PRIMARY | PRIMARY | 2 | func | 1 | 90.00 | Using where |
| 4 | DEPENDENT UNION | country | NULL | eq_ref | PRIMARY | PRIMARY | 2 | func | 1 | 90.00 | Using where |
| NULL | UNION RESULT | | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary |
| 2 | SUBQUERY | store | NULL | const | PRIMARY,idx_unique_manager | idx_unique_manager | 1 | const | 1 | 100.00 | Using index |
+----+--------------------+------------+------------+--------+----------------------------+--------------------+---------+---------------------+------+----------+-----------------+
7 rows in set, 1 warning (0.00 sec)
对于store的子查询的id变成了2,执行顺序却放到了最后。从上面的两个sql的输出来看,操作的id好像与出现在where条件的位置有关,不过不能十分肯定。但是可以肯定一点的是,对于复杂的sql语句,where条件的执行顺序可能会影响到执行计划的选择,有可能会应影响到语句的执行性能,MySQL数据库最好不要执行复杂的sql语句。
3.2 select_type,possible keys, key,table,ref
table:操作的行所在的表,可以是索引,表,union, derived(派生表),子查询(物化的子查询)。
possible_keys:可用的索引
key:实际用到的索引
ref:哪一些列或常量被用来同索引(key列)比较来从表中选择行,如果这一列的值是func,使用的值是某些函数计算的结果,具体使用哪个func可以在explain语句后使用showwarning来查看。
执行的类型,simple是指简单的select操作,没有用到子查询或联合。primary是指最外层的查询,subquery,子查询里的第一个查询,dependent subquery,子查询里的第一个查询,依赖于外层查询,derived 派生表,MATERIALIZED 物化子查询,UNCACHEABLE SUBQUERY,子查询的结果不能被缓存,对外层查询的每一行,必须重新运行子查询。依赖意味着运行了关联子查询。
在编写sql时,要尽可能避免dependent subquery,UNCACHEABLE SUBQUERY,这两个类型的子查询一是影响性能,而是优化器无法对其进行优化
除了select操作外,类型也可能时insert、delete和update,下面是一个insert的例子
mysql> explain insert into city(city, country_id) values ('beitang', 108), ('johntown', '109');
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| 1 | INSERT | city | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set (0.00 sec)
insert的执行计划里只有select type和table,其它的信息都为空。MySQL的insert要执行主键和唯一键的唯一性检测,应该会用到主键索引,执行计划里并没有显示。
mysql> explain delete from city where city_id=108;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
| 1 | DELETE | city | NULL | range | PRIMARY | PRIMARY | 2 | const | 1 | 100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------------+
1 row in set (0.00 sec)
delete的执行计划里可以看到使用了主键,从上面的执行计划可以看出,语句执行的可用键是主键(possible_keys),实际使用的键也是主键(key),用常量ref值为const和主键作比较,因此,在执行delete操作时,要考虑索引的使用,在单行删除时尽可能使用主键或唯一索引作为删除条件。
mysql> explain delete from city where country_id=108;
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+-------+------+----------+-------------+
| 1 | DELETE | city | NULL | range | idx_fk_country_id | idx_fk_country_id | 2 | const | 2 | 100.00 | Using where |
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+-------+------+----------+-------------+
1 row in set (0.00 sec)
非唯一索引也可以用来加速delete操作。
再看一个两表关联的例子
mysql> explain select a.city, b.country from city a , country b where a.country_id=b.country_id;
+----+-------------+-------+------------+------+-------------------+-------------------+---------+---------------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+---------------------+------+----------+-------+
| 1 | SIMPLE | b | NULL | ALL | PRIMARY | NULL | NULL | NULL | 109 | 100.00 | NULL |
| 1 | SIMPLE | a | NULL | ref | idx_fk_country_id | idx_fk_country_id | 2 | sakila.b.country_id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+---------------------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
从上面的执行计划可以看出,mysql先对表b进行全表扫描(执行计划的第一行),可用的keys是主键,因为是全表扫描,所以实际上没有使用主键,扫描出了109行。
从执行计划的第二行可以看出,用b表的country_id(ref)同a表的索引idx_fk_country_id进行比较匹配,连接后进行输出。
在上面的sql语句中加上一个对a表的限定条件后,再看一下执行计划。
mysql> explain select a.city, b.country from city a , country b where a.country_id=b.country_id and a.city_id=1;
+----+-------------+-------+------------+-------+---------------------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | a | NULL | const | PRIMARY,idx_fk_country_id | PRIMARY | 2 | const | 1 | 100.00 | NULL |
| 1 | SIMPLE | b | NULL | const | PRIMARY | PRIMARY | 2 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------------------+---------+---------+-------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
在a表加了限定条件后上面的查询的意义是查询city_id为1 的城市的名称及其所属国,家,对表a(city)的查询可用的索引有两个,因为查询条件是主键,所以实际查询时采用了主键,用常量和主键进行比较,获得了一行数据。从第二行可以看出,对b表的查询也是用常量(a表的输出是一行数据)和b表的主键进行匹配。
3.3 type
type,在MySQL官方文档中是join类型,实际是dba常说的访问类型。不同的访问类型在性能上有很大的差异,在编写sql和sql诊断时,首先要判断type是否合适。在执行计划中,可以看到以下几种访问类型。
NULL:MySQL在优化阶段就会解决查询,取得查询的值,不需要访问表。
system:当表中只有一行时,访问类型是system,system可以看作是const的特例。看一个例子:
mysql> explain select * from (select now()) a;
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+
| 1 | PRIMARY | | NULL | system | NULL | NULL | NULL | NULL | 1 | 100.00 | NULL |
| 2 | DERIVED | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
+----+-------------+------------+------------+--------+---------------+------+---------+------+------+----------+----------------+
2 rows in set, 1 warning (0.00 sec)
上面的执行计划中,第二个派生表的join类型是null,意味着在优化阶段就取得了now()的值,后面的extra的值是No tables used意思是在执行阶段没有访问任何表。第一个操作的类型是system,访问的表是派生表2(第二个操作产生的派生表),这个派生表只有一行,因此访问类型是system。
const:表中最多只有一个匹配的行。由于只读一次表,const表的速度非常快,当主键或唯一键同常量比较时MySQL会用到const。用主键查询一行数据就会用到const,像下面的例子:
mysql> explain select city from city where city_id=1;
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
| 1 | SIMPLE | city | NULL | const | PRIMARY | PRIMARY | 2 | const | 1 | 100.00 | NULL |
+----+-------------+-------+------------+-------+---------------+---------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
eq_ref:对前面每一个表的行组合,从表中读取一行。这是除了const和system之外最好的join类型,使用时的索引是主键和非空唯一键。
mysql> explain select a.city from city a , address b where a.city_id=b.city_id and b.postal_code<23616;
+----+-------------+-------+------------+--------+----------------+---------+---------+------------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+----------------+---------+---------+------------------+------+----------+-------------+
| 1 | SIMPLE | b | NULL | ALL | idx_fk_city_id | NULL | NULL | NULL | 603 | 33.33 | Using where |
| 1 | SIMPLE | a | NULL | eq_ref | PRIMARY | PRIMARY | 2 | sakila.b.city_id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+----------------+---------+---------+------------------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
第一个操作返回多行数据(603),在第二个操作中使用第一个操作中的city_id匹配表city中的数据,第一个操作中的每一个city_id匹配第二个操作的表(a的主键)中的一行数据,访问类型是eq_ref.
可以看出,当索引列用等号比较时,可以用到eq_ref.
ref : 对于外表中的行,从表中读取多行。可以用于等于或不等于操作符,或者是非唯一索引的情况下,比如下面的例子:
mysql> explain select a.city, b.country from city a , country b where a.country_id=b.country_id;
+----+-------------+-------+------------+------+-------------------+-------------------+---------+---------------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+---------------------+------+----------+-------+
| 1 | SIMPLE | b | NULL | ALL | PRIMARY | NULL | NULL | NULL | 109 | 100.00 | NULL |
| 1 | SIMPLE | a | NULL | ref | idx_fk_country_id | idx_fk_country_id | 2 | sakila.b.country_id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+------+-------------------+-------------------+---------+---------------------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
对于b(country)表的每一country_id, city表中有多行数据对应,join类型就是ref。
range:检索指定范围的值。key列值出用到的索引。
mysql> explain select * from city where city_id<3;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
| 1 | SIMPLE | city | NULL | range | PRIMARY | PRIMARY | 2 | NULL | 2 | 100.00 | Using where |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
index和all:这两个都是全扫,index扫描的对象时索引,all扫描的是表, 表的全扫比较常见,这里给出一个对索引进行全扫的例子。
mysql> explain select country_id from city;
+----+-------------+-------+------------+-------+---------------+-------------------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+---------------+-------------------+---------+------+------+----------+-------------+
| 1 | SIMPLE | city | NULL | index | NULL | idx_fk_country_id | 2 | NULL | 600 | 100.00 | Using index |
+----+-------------+-------+------------+-------+---------------+-------------------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
3.4 extra
extra列里有时会包含非常重要的信息,在查看执行计划时,尤其要关注extra列里的using filesort和using temporary ,这表示在语句的执行过程中执行了排序或临时表
的创建。
Using filesort:MySQL必须一次额外的排序操作。必须重新扫描一遍获得的数据,存储排序key和指向行的指针。然后将key排序,按照顺序检索行。可以看出这个操作非常消耗资源,对sql语句执行的性能影响是巨大的。
mysql> explain select city ,last_update from city order by last_update;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
| 1 | SIMPLE | city | NULL | ALL | NULL | NULL | NULL | NULL | 600 | 100.00 | Using filesort |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------+
1 row in set, 1 warning (0.01 sec)
extra列的信息是Using filesort,MySQL尽可能使这个filesort在内存中完成,如果需要排序的行过多,就会在磁盘上执行,这回进步降低性能。可以在last_update列上创建索引以优化这个sql,由于索引是有序的,也就避免了额外的排序操作。
using temporary:为了获得sql语句的执行结果,mysql必须创建一个临时表用来存储中间结果,这个通常发生在order和group by使用了不同的列时。比如下面的例子:
mysql> explain select a.country, a.country_id, count(b.city_id) from country a, city b where a.country_id=b.country_id g
roup by a.country_id order by a.country;
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+---------------------+------+----------+---------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+---------------------+------+----------+---------------------------------+
| 1 | SIMPLE | a | NULL | index | PRIMARY | PRIMARY | 2 | NULL | 109 | 100.00 | Using temporary; Using filesort |
| 1 | SIMPLE | b | NULL | ref | idx_fk_country_id | idx_fk_country_id | 2 | sakila.a.country_id | 5 | 100.00 | Using index |
+----+-------------+-------+------------+-------+-------------------+-------------------+---------+---------------------+------+----------+---------------------------------+
2 rows in set, 1 warning (0.00 sec)
这里group的列是country_id,排序的列是country_name,extra的值是Using temporary; Using filesort。
Using join buffer:join时出现这个信息是MySQL将前面表的结果的一部分存储到join 缓冲区内,和内表进行匹配,这样比每一行做一次匹配性能上要提高不少。看例子之前先做点准备工作,创建两个没有索引的表。
mysql> create table country_no_idx as select * from country;
Query OK, 109 rows affected (0.03 sec)
Records: 109 Duplicates: 0 Warnings: 0
mysql> create table city_no_idx as select * from city;
Query OK, 600 rows affected (0.03 sec)
Records: 600 Duplicates: 0 Warnings: 0
下面的执行计划使用了join buffer
mysql> explain select a.country, b.city from country_no_idx a, city_no_idx b where a.country_id=b.country_id;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
| 1 | SIMPLE | a | NULL | ALL | NULL | NULL | NULL | NULL | 109 | 100.00 | NULL |
| 1 | SIMPLE | b | NULL | ALL | NULL | NULL | NULL | NULL | 600 | 10.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+----------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
join buffer是MySQL对嵌套连接的一种优化,优化效果未必理想,可以使用索引来避免使用join buffer,具体做法是在内表(如果是内连接,在行数较大的表上)上创建一个索引。
4 使用物化优化子查询
MySQL优化器使用物化优化子查询。所谓的物化创建一个临时表,用来缓存子查询的结果,这个表通常在内存中创建。当MySQL第一次需要子查询的结果时,对子查询进行物化,优化器也可能对物化的临时表创建哈希所以以加速查询。物化临时表尽可能存在与内存中,表太大时会在磁盘上存储。
下面是子查询物化优化的一个例子。
mysql> explain select * from city where country_id in (select country_id from country_no_idx where country_id=100);
+----+--------------+----------------+------------+--------+-------------------+-------------------+---------+------------------------+------+----------+-----------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+----------------+------------+--------+-------------------+-------------------+---------+------------------------+------+----------+-----------------------+
| 1 | SIMPLE | city | NULL | ref | idx_fk_country_id | idx_fk_country_id | 2 | const | 6 | 100.00 | Using index condition |
| 1 | SIMPLE | | NULL | eq_ref | | | 2 | sakila.city.country_id | 1 | 100.00 | Using where |
| 2 | MATERIALIZED | country_no_idx | NULL | ALL | NULL | NULL | NULL | NULL | 109 | 10.00 | Using where |
+----+--------------+----------------+------------+--------+-------------------+-------------------+---------+------------------------+------+----------+-----------------------+
3 rows in set, 1 warning (0.00 sec)
从上面例子中的id为3的操作中看到,MySQL优化器对子查询的结果进行了物化,并且创建了哈希索引,这从id 2操作的possible keys和key的值可以看出。
对子查询进行物化虽然对sql的性能有一定提升,但是频繁的创建物化临时表对内存和性能都会有一定的压力,对这类子查询进一步优化的措施是子查询使用唯一索引或主键查询,
mysql> explain select * from city where country_id in (select country_id from country where country_id=100);
+----+-------------+---------+------------+-------+-------------------+-------------------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+-------+-------------------+-------------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | country | NULL | const | PRIMARY | PRIMARY | 2 | const | 1 | 100.00 | Using index |
| 1 | SIMPLE | city | NULL | ref | idx_fk_country_id | idx_fk_country_id | 2 | const | 6 | 100.00 | NULL |
+----+-------------+---------+------------+-------+-------------------+-------------------+---------+-------+------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)