stergosz
stergosz

Reputation: 5860

optimizing query order by results to Using filesort;

Query :

    SELECT
        r.reply_id,
        r.msg_id,
        r.uid,
        r.body,
        r.date,
        u.username as username,
        u.profile_picture as profile_picture
    FROM
        pm_replies as r
        LEFT JOIN users as u
            ON u.uid = r.uid
    WHERE
        r.msg_id = '784351921943772258'

    ORDER BY r.date DESC

i tried all index combinations i could think of, searched in google how best i could index this but nothing worked.

this query takes 0,33 on 500 returned items and counting...


EXPLAIN:

id  select_type     table   type    possible_keys   key     key_len     ref     rows    Extra
1   SIMPLE  r   ALL     index1  NULL    NULL    NULL    540     Using where; Using filesort
1   SIMPLE  u   eq_ref  uid     uid     8   site.r.uid  1   

SHOW CREATE pm_replies

CREATE TABLE `pm_replies` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `reply_id` bigint(20) NOT NULL,
 `msg_id` bigint(20) NOT NULL,
 `uid` bigint(20) NOT NULL,
 `body` text COLLATE utf8_unicode_ci NOT NULL,
 `date` datetime NOT NULL,
 PRIMARY KEY (`id`),
 KEY `index1` (`msg_id`,`date`,`uid`)
) ENGINE=MyISAM AUTO_INCREMENT=541 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

SHOW CREATE users

CREATE TABLE `users` (
 `id` bigint(20) NOT NULL AUTO_INCREMENT,
 `uid` bigint(20) NOT NULL,
 `username` varchar(20) COLLATE utf8_unicode_ci NOT NULL,
 `email` text CHARACTER SET latin1 NOT NULL,
 `password` text CHARACTER SET latin1 NOT NULL,
 `profile_picture` text COLLATE utf8_unicode_ci NOT NULL,
 `date_registered` datetime NOT NULL,
 PRIMARY KEY (`id`),
 UNIQUE KEY `uid` (`uid`),
 UNIQUE KEY `username` (`username`)
) ENGINE=MyISAM AUTO_INCREMENT=2004 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

Upvotes: 2

Views: 211

Answers (5)

DRapp
DRapp

Reputation: 48179

It appears the optimizer is trying to force the index by ID to make the join to the user table. Since you are doing a left-join (which doesn't make sense since I would expect every entry to have a user ID, thus a normal INNER JOIN), I'll keep it left join.

So, I would try the following. Query just the replies based on the MESSAGE ID and order by the date descending on its own merits, THEN left join, such as

SELECT
        r.reply_id,
        r.msg_id,
        r.uid,
        r.body,
        r.date,
        u.username as username,
        u.profile_picture as profile_picture
    FROM
        ( select R2.* 
             from pm_replies R2
             where r2.msg_id = '784351921943772258' ) r
        LEFT JOIN users as u
            ON u.uid = r.uid
    ORDER BY
        r.date DESC

In addition, since I don't have MySQL readily available, and can't remember if order by is allowed in a sub-query, if so, you can optimize the inner prequery (using alias "R2") and put the order by there, so it uses the (msgid, date) index and returns just that set... THEN joins to user table on the ID which no index is required at that point from the SOURCE result set, just the index on the user table to find the match.

Upvotes: 0

Branko Dimitrijevic
Branko Dimitrijevic

Reputation: 52157

What Dems is saying should be correct, but there is one additional detail if you are using InnoDB: perhaps you are paying the price of secondary indexes on clustered tables - essentially, accessing a row through the secondary index requires additional lookup trough the primary, i.e. clustering index. This "double lookup" might make the index less attractive to the query optimizer.

To alleviate this, try covering the all the fields in your select statement with the index:

pm_replies: (msg_id, date, uid, reply_id, body, date)
users:      (uid, username, profile_picture)

Upvotes: 0

davidethell
davidethell

Reputation: 12048

Add date to your index1 key so that msg_id and date are both in the index.

Upvotes: 0

MatBailie
MatBailie

Reputation: 86808

For the query as it is, the best indexes would seem to be...

pm_replies: (msg_id, date, uid)
users:      (uid)

The important one is pm_replies. You use it to both filter your data (the filter column is first) then order your data (the order column is second).

The would be different if you removed the filter. Then you'd just want (date, uid) as your index.

The last field in the index just makes it a fraction friendlier to the join, the important part is actually the index on users.

There is a lot more that coudl be said on this, a whole chapter in a book at the very least, and several books if your wanted to. But I hope this helps.


EDIT

Not that my suggested index for pm_replies is one index covering three fields, and not just three indexes. This ensures that all the entries in the index are pre-sorted by those columns. It's like sorting data in Excel by three columns.

Having three separate indexes is like having the Excel data on three tabs. Each sorted by a different fields.

Only whith one index over three fields do you get this behaviour...
- You can select one 'bunch' of records with the same msg_id
- That whole 'bunch' are next to each other, no gaps, etc
- That whole 'bunch' are sorted in date order for that msg_id
- For any rows with the same date, they're ordered by user_id

(Again the user_id part is really very minor.)

Upvotes: 5

Mirodil
Mirodil

Reputation: 2329

Please try this:

SELECT
        r.reply_id,
        r.msg_id,
        r.uid,
        r.body,
        r.date,
        u.username as username,
        u.profile_picture as profile_picture
    FROM
        pm_replies as r
        LEFT JOIN users as u
            ON (u.uid = r.uid AND r.msg_id = '784351921943772258')
    ORDER BY r.date DESC

in my case it help.

Upvotes: 0

Related Questions