Chance Hsu
Chance Hsu

Reputation: 73

Very slow query in ORDER BY with larger LIMIT range

MySQL 5.6, 64-bit, RHEL 5.8
A query on a large table with ORDER BY and LIMIT 'row_count' (or LIMIT 0,'row_count'). If the 'row_count' is larger then real count of result set, will be very very slow.

case 1: The query below is very fast (No 'LIMIT'):

mysql> SELECT * FROM syslog WHERE 
        (ReportedTime BETWEEN '2013-11-04' AND '2013-11-05') AND
        Priority<3 AND Facility=1 ORDER BY id DESC;
+---
| ...
6 rows in set (0.01 sec)

case 2: The query below is also fast ('LIMIT 5'):

mysql> SELECT * FROM syslog WHERE 
        (ReportedTime BETWEEN '2013-11-04' AND '2013-11-05') AND
        Priority<3 AND Facility=1 ORDER BY id DESC LIMIT 5;
+---
| ...
5 rows in set (0.42 sec)

case 3: The query below is very very slow ('LIMIT 7', may use any 'row_count' value > 6):

mysql> SELECT * FROM syslog WHERE 
        (ReportedTime BETWEEN '2013-11-04' AND '2013-11-05') AND
        Priority<3 AND Facility=1 ORDER BY id DESC LIMIT 7;
+---
| ...
6 rows in set (28 min 7.24 sec)

Difference is just only individual (No LIMIT), "LIMIT 5", and "LIMIT 7".
Why is case 3 so slow?
Some investigations in the case 3:

  1. Run command 'SHOW PROCESS', the State of the query is kept in 'Sending data'
    • Checked the server memory, it's still available enough.
    • Extended SESSION buffers 'read_buffer_size','read_rnd_buffer_size','sort_buffer_size' to very large amount (to 16MB) right before running query, but no help.
    • Also query only the column 'id' (SELECT id FROM syslog ....), but the same result.
  2. During the query is running, raised the same query but with row_count<5 (eg. 'LIMIT 5') in another mysql connection, the return of latter is still very soon.
  3. With different condition, for example, extend the time range BETWEEN '2013-10-03' to '2013-11-05' to gain result row count 149. With LIMIT 140, it's fast. With LIMIT 150, it's very very slow. So strange.

Currently in practice, in our website, the program gets the real result row count first (SELECT COUNT(*) FROM ..., No ORDER BY, No LIMIT), and afterwards do the query with the LIMIT 'row_count' value not exceeding the real row count got just now. Ugly.

The EXPLAIN for case 3:

-+-----..-+----..+-------+-----..+--------+---------+-----+-----+------------+
 | sele.. | table| type  | poss..| key    | key_len | ref | rows| Extra      |
-+-----..-+----..+-------+-----..+--------+---------+-----+-----+------------+
 | SIMPLE | syslo| index | ...   | PRIMARY| 8       | NULL| 132 | Using where|
-+-----..-+----..+-------+-----..+--------+---------+-----+-----+------------+
1 row in set (0.00 sec)

Table definition:

CREATE TABLE syslog (
    id          BIGINT NOT NULL AUTO_INCREMENT,
    ReceivedAt  TIMESTAMP NOT NULL DEFAULT 0,
    ReportedTime TIMESTAMP NOT NULL DEFAULT 0,
    Priority    SMALLINT,
    Facility    SMALLINT,
    FromHost    VARCHAR(60),
    Message     TEXT,
    InfoUnitID  INT NOT NULL DEFAULT 0,
    SysLogTag   VARCHAR(60) NOT NULL DEFAULT '',
    PRIMARY KEY (id),
    KEY idx_ReportedTime_Priority_id (ReportedTime,Priority,id),
    KEY idx_Facility (Facility),
    KEY idx_SysLogTag (SysLogTag(16)),
    KEY idx_FromHost (FromHost(16))
);

Upvotes: 3

Views: 2566

Answers (1)

Sebas
Sebas

Reputation: 21522

Mysql is famous for its behaviour around ORDER BY DESC + LIMIT clause.

See: http://www.mysqlperformanceblog.com/2006/09/01/order-by-limit-performance-optimization/

Please try:

SELECT * 
FROM syslog FORCE INDEX (Facility)
WHERE 
    ReportedTime BETWEEN '2013-11-04' AND '2013-11-05' 
AND Priority<3 
AND Facility=1 
ORDER BY id DESC 
LIMIT 7;

You need to force the use of the index used in first queries. (get it from their explain plans, KEY column)

Upvotes: 6

Related Questions