Reputation: 1982
I've got a problem with a MySQL query where the wrong (inefficient) index is used.
The table:
mysql> describe ADDRESS_BOOK;
+---------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+--------------+------+-----+---------+----------------+
| ADD_BOOK_ID | bigint(20) | NO | PRI | NULL | auto_increment |
| COMPANY_ID | bigint(20) | NO | MUL | NULL | |
| ADDRESS_NAME | varchar(150) | NO | MUL | NULL | |
| CLEAN_NAME | varchar(150) | NO | MUL | NULL | |
| ADDRESS_KEY_1 | varchar(150) | NO | MUL | NULL | |
| ADDRESS_KEY_2 | varchar(150) | NO | MUL | NULL | |
+---------------+--------------+------+-----+---------+----------------+
CLEAN_NAME is a 'cleaned' version of the normal ADDRESS_NAME where everything but [a-zA-Z] has been removed, ADDRESS_KEY1 and ADDRESS_KEY2 are the two longest words in ADDRESS_NAME, again everything but [a-zA-Z] removed.
These are my indexes (playing around with it trying to find the best):
mysql> SHOW INDEX FROM ADDRESS_BOOK;
+--------------+------------+-------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------+------------+-------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| ADDRESS_BOOK | 0 | PRIMARY | 1 | ADD_BOOK_ID | A | 37847 | NULL | NULL | | BTREE | | |
| ADDRESS_BOOK | 1 | FK_ADDRESS_BOOK_2 | 1 | COMPANY_ID | A | 36 | NULL | NULL | | BTREE | | |
| ADDRESS_BOOK | 1 | IDX_ADDRESS_NAME | 1 | ADDRESS_NAME | A | 37847 | NULL | NULL | | BTREE | | |
| ADDRESS_BOOK | 1 | FX_ADDRESS_KEYS | 1 | CLEAN_NAME | A | 37847 | NULL | NULL | | BTREE | | |
| ADDRESS_BOOK | 1 | FX_ADDRESS_KEYS | 2 | ADDRESS_KEY_1 | A | 37847 | NULL | NULL | | BTREE | | |
| ADDRESS_BOOK | 1 | FX_ADDRESS_KEYS | 3 | ADDRESS_KEY_2 | A | 37847 | NULL | NULL | | BTREE | | |
| ADDRESS_BOOK | 1 | FX_ADDRESS_KEYS | 4 | COMPANY_ID | A | 37847 | NULL | NULL | | BTREE | | |
| ADDRESS_BOOK | 1 | FK_ADDRESS_2 | 1 | ADDRESS_KEY_2 | A | 18923 | NULL | NULL | | BTREE | | |
| ADDRESS_BOOK | 1 | FK_CLEAN | 1 | CLEAN_NAME | A | 37847 | NULL | NULL | | BTREE | | |
| ADDRESS_BOOK | 1 | FK_ADDRESS_1 | 1 | ADDRESS_KEY_1 | A | 37847 | NULL | NULL | | BTREE | | |
+--------------+------------+-------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
Now my query is:
select * from ADDRESS_BOOK addressboo0_
where (addressboo0_.CLEAN_NAME like concat('trad', '%')
or addressboo0_.ADDRESS_KEY_1 like concat('trad', '%')
or addressboo0_.ADDRESS_KEY_2 like concat('trad', '%'))
and addressboo0_.COMPANY_ID=1
order by addressboo0_.CLEAN_NAME asc
limit 200
There are users from different companies in the system, so a query should only return address book entries for the company of the user.
The explain for that is
+----+-------------+--------------+------+----------------------------------------------------------------------+-------------------+---------+-------+------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+------+----------------------------------------------------------------------+-------------------+---------+-------+------+-----------------------------+
| 1 | SIMPLE | addressboo0_ | ref | FK_ADDRESS_BOOK_2,FX_ADDRESS_KEYS,FK_ADDRESS_2,FK_CLEAN,FK_ADDRESS_1 | FK_ADDRESS_BOOK_2 | 8 | const | 4108 | Using where; Using filesort |
+----+-------------+--------------+------+----------------------------------------------------------------------+-------------------+---------+-------+------+-----------------------------+
I know that MySQL can't use multicolumn indexes on or queries but as you can see it is using the index for COMPANY (FK_ADDRESS_BOOK_2) and not any of the indexes for the string columns!
If I take the company out from the query it will use the other indexes:
+----+-------------+--------------+-------------+----------------------------------------------------+------------------------------------+-------------+------+------+-----------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+-------------+----------------------------------------------------+------------------------------------+-------------+------+------+-----------------------------------------------------------------------------------+
| 1 | SIMPLE | addressboo0_ | index_merge | FX_ADDRESS_KEYS,FK_ADDRESS_2,FK_CLEAN,FK_ADDRESS_1 | FK_CLEAN,FK_ADDRESS_1,FK_ADDRESS_2 | 452,452,452 | NULL | 1089 | Using sort_union(FK_CLEAN,FK_ADDRESS_1,FK_ADDRESS_2); Using where; Using filesort |
+----+-------------+--------------+-------------+----------------------------------------------------+------------------------------------+-------------+------+------+-----------------------------------------------------------------------------------+
If I use the same query (incl company) for a different company it suddenly uses the multi-column index:
+----+-------------+--------------+-------+----------------------------------------------------------------------+-----------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+-------+----------------------------------------------------------------------+-----------------+---------+------+------+-------------+
| 1 | SIMPLE | addressboo0_ | index | FK_ADDRESS_BOOK_2,FX_ADDRESS_KEYS,FK_ADDRESS_2,FK_CLEAN,FK_ADDRESS_1 | FX_ADDRESS_KEYS | 1364 | NULL | 492 | Using where |
+----+-------------+--------------+-------+----------------------------------------------------------------------+-----------------+---------+------+------+-------------+
So for company 1 it has 266 results while for company 16 it has 437. In total company 1 has 4109 entries while company 16 has 7745 entries.
So I am rather confused. Why is MySQL using the multi-column index FX_ADDRESS_KEYS for one company but the rather inefficient FK_ADDRESS_BOOK_2 for the other company (basically going through every single row for that company).
How can I improve the query/index? If I remove the or for ADDRESS_KEY_1 and ADDRESS_KEY_2 it is using the FX_ADDRESS_KEYS index but I lose the ability to search for Strings inside the name. If I use something like '%trade%' no index can be used.
Upvotes: 1
Views: 82
Reputation: 36137
If you want to have a pretty looking explain plan for this query, then try this:
CREATE INDEX FX_ADDRESS_KEYS_XX ON ADDRESS_BOOK(
COMPANY_ID,
CLEAN_NAME,
ADDRESS_KEY_1,
ADDRESS_KEY_2 );
This index should improve the query, but at some costs.
It contains a copy of almost the whole table (except 2 columns: ADD_BOOK_ID bigint(20)
and ADDRESS_NAME varchar(150)
) - it will take quite a lot of disk space.
And it for sure slow down inserts and updates, since index data must also be updated.
Upvotes: 1