MySQL数据库在使用group by查询时默认会进行排序,有时候我们并不需要这种排序,消除这种排序有两种方式,本文介绍其中一种,即使用索引消除排序,这种方式还有一个附加的好处,就是避免临时表的创建。
文中的示例选择My SQL官方的sakila数据库。
1 创建测试表格
create table film_category_cp as select * from film_category;
2 创建的表上没有任何索引, 查看一个简单使用group by 查询语句执行计划
explain select category_id, count(*) from film_category_cp group by category_id\G;
输出结果(删除了值为空和没有意义的内容,以后的输出依此处理)如下:
id: 1
select_type: SIMPLE
table: film_category_cp
rows: 1000filtered: 100.00
Extra: Using temporary; Using filesort
从上面的输出来看,查询的过程中需要创建临时并进行了排序。
3 在group by 列上创建一个索引
create index idx_category_id_cp on film_category_cp(category_id);
4 再次运行上面的查询,输出结果如下:
1. row **
id: 1
select_type: SIMPLE
table: film_category_cp
type:index
possible_keys:idx_category_id_cp
key: idx_category_id_cp
rows: 1000
filtered: 100.00
Extra: Using index
上面的输出结果中已经没有临时表和文件排序了,可见查询用到了索引。
5 再看一个稍微复杂点的例子,即两表关联查询下g roup by 查询应该怎样创建索引。
5.1 创建测试表格
create table customer_cp as select * from customer;
create table rental_cp as select * from rental;
5.2 运行量表关联的group by语句
explain select a.store_id, count(b.rental_id) from customer_cp a inner join rental_cp b on a.customer_id=b.customer_idgroup by a.store_id\G;
这条查询的结果如下:
- row
id: 1
select_type: SIMPLE
table: a
rows: 599
filtered: 100.00
Extra: Using temporary; Using filesort
- row
id: 1
select_type: SIMPLE
table: b
rows: 15831
filtered: 10.00
Extra: Using where; Using join buffer (Block Nested Loop)
从输出来看,第一个查询过程中创建了临时表,并对结果进行排序。
5.3 在关联键上创建索引
create index idx_id_customer on customer_cp(customer_id);
create index idx_cid_rental on rental_cp(customer_id);
5.4 再次运行上面的查询, 结果如下:
1. row **
id: 1
select_type: SIMPLE
table: a
type: ALL
possible_keys: idx_id_customer
key: NULL
rows: 599
filtered: 100.00
Extra: Using temporary; Using filesort
2. row **
id: 1
select_type: SIMPLE
table: b
type: ref
possible_keys: idx_cid_rental
key: idx_cid_rental
key_len: 2
ref: sakila.a.customer_id
rows: 26
filtered: 100.00
Extra: NULL
第一个查询的执行计划没有变化,第二个查询使用到了索引,避免了join buffer的使用。
5.5 在group 列上创建索引
create index idx_sid_customer on customer_cp(store_id);
5.6 再次运行sql,结果如下
- row
id: 1
select_type: SIMPLE
table: a
type: ALL
possible_keys: idx_id_customer,idx_sid_customer
key: NULL
key_len: NULLref: NULL
rows: 599
filtered: 100.00
Extra: Using temporary; Using filesort
- row
id: 1
select_type: SIMPLE
table: b
type: ref
possible_keys: idx_cid_rental
key: idx_cid_rental
key_len: 2
ref: sakila.a.customer_id
rows: 26filtered: 100.00
Extra: NULL
只在group by 的语句上创建索引并不能消除排序和临时表
5.7 在关联 列上创建符合索引后运行上面的查询
create index idx_sid_cid on customer_cp(store_id, customer_id);
运行结果如下
- row
id: 1
select_type: SIMPLE
table: a
type: index
possible_keys: idx_sid_cid
key: idx_sid_cid
key_len: 3
rows: 599
filtered: 100.00
Extra: Using index
- row
id: 1
select_type: SIMPLE
table: b
partitions: NULL
type: ref
possible_keys: idx_cid_rental
key: idx_cid_rental
key_len: 2
ref: sakila.a.customer_id
rows: 26
filtered: 100.00
Extra: NULL
再看执行计划,已经没有临时表和排序了
6 另一个稍微复杂点的例子 group by 多个列的情况
6.1 使用sakila数据库的rental表做测试,这个表上已经建立了几个索引
1.row
Key_name: PRIMARY
Seq_in_index: 1
Column_name: rental_id
Index_type: BTREE
- row
Key_name: rental_date
Seq_in_index: 1
Column_name: rental_date
Index_type: BTREE
- row *
Key_name: rental_date
Seq_in_index: 2
Column_name: inventory_id
- row
Key_name: rental_date
Column_name: customer_id
- row
Key_name: idx_fk_inventory_id
Column_name: inventory_id
- row
Key_name: idx_fk_customer_id
Column_name: customer_id
- row
Key_name: idx_fk_staff_id
Column_name: staff_id
6.2 运行sql
explain select customer_id,staff_id , count(*) from rental group by customer_id,staff_id\G;
输出结果
id: 1
select_type: SIMPLE
table: rental
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 16008
filtered: 100.00
Extra: Using temporary; Using filesort
有临时表和排序
6.3 创建一个复合索引,索引键顺序按照group by 列的顺序
create index idx_c_s on rental(customer_id,staff_id);
6.4 再次运行sql,输出结果如下
id: 1
select_type: SIMPLE
table: rental
partitions: NULL
type: index
possible_keys: idx_c_s
key: idx_c_s
key_len: 3
ref: NULL
rows: 16008f
iltered: 100.00
Extra: Using index
使用索引消除了排序,sql执行时间也有所减少