Reputation: 50127
I'm hitting some quite major performances issues due to the use of "ORDER BY"-statements in my SQL-code.
Everything is fine as long as I'm not using ORDER BY-statements in the SQL. However, once I introduce ORDER BY:s in the SQL code everything slows down dramatically due to the lack of correct indexing. One would assume that fixing this would be trivial, but judging from forum discussions, etc this seems to be a rather common issue that I've yet to see a definitive and concise answer to this question.
Question: Given the following table ...
CREATE TABLE values_table ( id int(11) NOT NULL auto_increment, ... value1 int(10) unsigned NOT NULL default '0', value2 int(11) NOT NULL default '0', PRIMARY KEY (id), KEY value1 (value1), KEY value2 (value2), ) ENGINE=MyISAM AUTO_INCREMENT=2364641 DEFAULT CHARSET=utf8;
... how do I create indexes that will be used when querying the table for a value1-range while sorting on the value of value2?
Currently, the fetching is OK when NOT using the ORDER BY clause.
See the following EXPLAIN QUERY output:
OK, when NOT using ORDER BY: EXPLAIN select ... from values_table this_ where this_.value1 between 12345678 and 12349999 limit 10; +----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+ | 1 | SIMPLE | this_ | range | value1 | value1 | 4 | NULL | 3303 | Using where | +----+-------------+-------+-------+---------------+----------+---------+------+------+-------------+
However, when using ORDER BY I get "Using filesort": EXPLAIN select ... from values_table this_ where this_.value1 between 12345678 and 12349999 order by this_.value2 asc limit 10; +----+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------------+ | 1 | SIMPLE | this_ | range | value1 | value1 | 4 | NULL | 3303 | Using where; Using filesort | +----+-------------+-------+-------+---------------+----------+---------+------+------+-----------------------------+
Some additional information about the table content:
SELECT MIN(value1), MAX(value1) FROM values_table; +---------------+---------------+ | MIN(value1) | MAX(value2) | +---------------+---------------+ | 0 | 4294967295 | +---------------+---------------+ ... SELECT MIN(value2), MAX(value2) FROM values_table; +---------------+---------------+ | MIN(value2) | MAX(value2) | +---------------+---------------+ | 1 | 953359 | +---------------+---------------+
Please let me know if any further information is needed to answer the question.
Thanks a lot in advance!
Update #1: Adding a new composite index (ALTER TABLE values_table ADD INDEX (value1, value2);) does not solve the problem. You'll still get "Using filesort" after adding such an index.
Update #2: A constraint that I did not mention in my question is that I'd rather change the structure of the table (say adding indexes, etc.) than changing the SQL queries used. The SQL queries are auto-generated using Hibernate, so consider those more or less fixed.
Upvotes: 18
Views: 13848
Reputation: 882816
It appears to me that you have two totally independent keys, one for value1 and one for value2.
So when you use the value1 key to retrieve, the records aren't necessarily returned in order of value2, so they have to be sorted. This is still better than a full table scan since you're only sorting the records that satisfy your "where value1" clause.
I think (if this is possible in MySQL), a composite key on (value1,value2) would solve this.
Try:
CREATE TABLE values_table (
id int(11) NOT NULL auto_increment,
...
value1 int(10) unsigned NOT NULL default '0',
value2 int(11) NOT NULL default '0',
PRIMARY KEY (id),
KEY value1 (value1),
KEY value1and2 (value1,value2),
) ENGINE=MyISAM AUTO_INCREMENT=2364641 DEFAULT CHARSET=utf8;
(or the equivalent ALTER TABLE
), assuming that's the correct syntax in MySQL for a composite key.
In all databases I know (and I have to admit MySQL isn't one of them), that would cause the DB engine to select the value1and2 key for retrieving the rows and they would already be sorted in value2-within-value1 order, so wouldn't need a file sort.
You can still keep the value2 key if you need it.
Upvotes: 0
Reputation: 425863
You cannot use an index in this case, as you use a RANGE
filtering condition.
If you'd use something like:
SELECT *
FROM values_table this_
WHERE this_.value1 = @value
ORDER BY
value2
LIMIT 10
, then creating a composite index on (VALUE1, VALUE2)
would be used both for filtering and for ordering.
But you use a ranged condition, that's why you'll need to perform ordering anyway.
Your composite index will look like this:
value1 value2 ----- ------ 1 10 1 20 1 30 1 40 1 50 1 60 2 10 2 20 2 30 3 10 3 20 3 30 3 40
, and if you select 1
and 2
in value1
, you still don't get a whole sorted set of value2
.
If your index on value2
is not very selective (i. e. there are not many DISTINCT value2
in the table), you could try:
CREATE INDEX ix_table_value2_value1 ON mytable (value2, value1)
/* Note the order, it's important */
SELECT *
FROM (
SELECT DISTINCT value2
FROM mytable
ORDER BY
value2
) q,
mytable m
WHERE m.value2 >= q.value2
AND m.value2 <= q.value2
AND m.value1 BETWEEN 13123123 AND 123123123
This is called a SKIP SCAN
access method. MySQL
does not support it directly, but it can be emulated like this.
The RANGE
access will be used in this case, but probably you won't get any performance benefit unless DISTINCT value2
comprise less than about 1%
of rows.
Note usage of:
m.value2 >= q.value2
AND m.value2 <= q.value2
instead of
m.value2 = q.value2
This makes MySQL
perform RANGE
checking on each loop.
Upvotes: 19