MySQL中组合索引使用小结

总结一些MySQL中组合索引，什么情况下会走索引，什么情况下无法走到索引

测试表结构及数据

CREATE TABLE `t` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  KEY `idx_a_b` (`a`,`b`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

insert into t values (1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3);
下面这两条语句是为了测试场景1中，能出现跳跃索引扫描
INSERT INTO t SELECT a, b + 5 FROM t;
INSERT INTO t SELECT a, b + 10 FROM t;

场景1where条件后只带有组合索引的第二个字段

root@localhost:mysql-8.0-3319.sock  16:26:33 [dhy]> explain select a,b from t where b = 2;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                  |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
|  1 | SIMPLE      | t     | NULL       | range | idx_a_b       | idx_a_b | 10      | NULL |    3 |   100.00 | Using where; Using index for skip scan |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+----------------------------------------+
1 row in set, 1 warning (0.00 sec)

从执行计划中可以看到，key这列显示使用到索引，同时key_ley的长度是10，也就是使用到了组合索引的所有字段，Extra中显示走了跳跃索引扫描(Using index for skip scan)，如果是8.0以下的版本则不会出现跳跃索引，而是索引全表扫描。
这里执行时的流程应该是如下发生:

获取组合索引中，第一列去重之后的第一个值(本例子中就是a=1)
根据此值与组合索引中的第二列构成一个查询条件(a=1 and b = 2)
执行查询
获取组合索引中，第一列去重之后的第二个值(本例子中就是a=2)
根据此值与组合索引中的第二列构成一个查询条件(a=2 and b = 2)
执行查询

反复按以上规则执行，直到第一列去重之后的值全部扫描完成

场景2where条件后第一列是范围查询

mysql> explain select * from t where a > 1 and b = 2;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                    |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
|  1 | SIMPLE      | t     | NULL       | range | idx_a_b       | idx_a_b | 5       | NULL |    6 |    11.11 | Using where; Using index |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

从这个执行计划中看到，key_len是5，也就是只使用到了组合索引的一部分，其实这里只使用到了组合索引的前半部分。
为何只能使用前半部分呢？我们再看下表中数据情况

mysql> select * from t ;
+------+------+
| a    | b    |
+------+------+
|    1 |    1 |
|    1 |    2 |
|    1 |    3 |
|    2 |    1 |
|    2 |    2 |
|    2 |    3 |
|    3 |    1 |
|    3 |    2 |
|    3 |    3 |
+------+------+
9 rows in set (0.00 sec)

当a>1这个where条件过滤后，数据就变成了

mysql> select * from t where a > 1 ;
+------+------+
| a    | b    |
+------+------+
|    2 |    1 |
|    2 |    2 |
|    2 |    3 |
|    3 |    1 |
|    3 |    2 |
|    3 |    3 |
+------+------+
6 rows in set (0.00 sec)

这里a字段是排序的，但是b字段则是无序的，所以第二列是无法用到索引的。

同理我们就知道了，只要第一列是范围查询，后面的第二列则无法用到索引，向下面这种order by和group by也是同样的

mysql> explain select * from t where a > 1 order by b ;
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+------------------------------------------+
| id | select_type | table | partitions | type  | possible_keys | key     | key_len | ref  | rows | filtered | Extra                                    |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+------------------------------------------+
|  1 | SIMPLE      | t     | NULL       | range | idx_a_b       | idx_a_b | 5       | NULL |    6 |   100.00 | Using where; Using index; Using filesort |
+----+-------------+-------+------------+-------+---------------+---------+---------+------+------+----------+------------------------------------------+
1 row in set, 1 warning (0.00 sec)

可以看到这里使用到了排序，在MySQL8.0中explain 后面加上format=tree可以更加清楚看到执行顺序:

oot@localhost:mysql-8.0-3319.sock  22:07:57 [dhy]>explain format=tree select * from t where a > 1 order by b ;
+---------------------------------------------------------------------------------------+
| EXPLAIN                                                                               |
+---------------------------------------------------------------------------------------+
| -> Sort: t.b
    -> Filter: (t.a > 1)
        -> Index range scan on t using idx_a_b
 |
+---------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

先根据索引做范围查询，过滤掉不满足条件(a>1)的数据，然后在根据b字段排序

场景3where条件后第一列是等值查询

mysql> explain select * from t where a =1 and b = 2 ;
+----+-------------+-------+------------+------+---------------+---------+---------+-------------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key     | key_len | ref         | rows | filtered | Extra       |
+----+-------------+-------+------------+------+---------------+---------+---------+-------------+------+----------+-------------+
|  1 | SIMPLE      | t     | NULL       | ref  | idx_a_b       | idx_a_b | 10      | const,const |    1 |   100.00 | Using index |
+----+-------------+-------+------------+------+---------------+---------+---------+-------------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

这个执行计划就非常好了，用到了索引，并且key_len是10，同时select_type是ref代表是索引扫描(不是全扫描)。
之所以这里能使用组合索引两个字段做索引扫描的原因，是因为根据第一例的值查询后，第二列的值也是排序的，所以能利用到第二列的索引