Carson
Carson

Reputation: 17741

How can I optimize this confusingly slow query in MySQL?

I have a table of blog posts, each with a foreign key back to it's author. There are < 15,000 entries in this table. This query scans over 19,000 rows (per EXPLAIN), requires a filesort (that might be regular MySQL behavior), and takes over 400ms to return 5 rows. possibly because of the complicated WHERE used to check if the item is actually published.

Dearest Stack Overflow, how can I wrangle this query under control?

Note: while this criteria might be up for simplification, all of the conditions are required.

SELECT      `blog_post.id`, 
            `blog_post.title`,
            `blog_post.author_id`,
            `blog_post.has_been_fact_checked`,
            `blog_post.published_date`,
            `blog_post.ordering`,
            `auth_user.username`,
            `auth_user.email`
FROM        `blog_post` 
INNER JOIN  `auth_user` 
ON          (`blog_post`.`author_id` = `auth_user`.`id`) 
WHERE       (`blog_post`.`is_approved` = True  AND 
             `blog_post`.`has_been_fact_checked` = True  AND 
             `blog_post`.`published_date` IS NOT NULL AND 
             `blog_post`.`published_date` <= '2010-10-25 22:40:05' ) 
ORDER BY    `blog_post`.`published_date` DESC, 
            `blog_post`.`ordering` ASC, 
            `blog_post`.`id` DESC 
LIMIT 5

Aside from the PKs, I have the following indexes on the table:

idx_published_blog_post -> blog_post(is_approved, has_been_fact_checked, published_date)
idx_pub_date -> blog_post(published_date)

The output from EXPLAIN looks like this:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: blog_post
         type: ref
possible_keys: blog_post_author_id,idx_published_blog_post,idx_pub_date
          key: idx_published_blog_post
      key_len: 4
          ref: const,const
         rows: 19856
        Extra: Using where; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: auth_user
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: blog.blog_post.author_id
         rows: 1
        Extra: Using index
2 rows in set (0.00 sec)

Side-note: 2010-10-25 22:40:05 is just a date generated by the code that executes this query.

Thanks so much for any & all help!

Upvotes: 4

Views: 439

Answers (3)

Quassnoi
Quassnoi

Reputation: 425251

MySQL does not support ASC/DESC clauses in indexes.

You would need to create a separate column called reverse_ordering and set its value to -ordering (provided that ordering is an numeric value)

Then you could create the following index:

CREATE INDEX ix_blogpost_a_c_p_ro_id ON blog_post (is_approved, has_been_fact_checked, published_date, reverse_ordering, id)

and rewrite your query:

SELECT      `blog_post.id`, 
            `blog_post.title`,
            `blog_post.author_id`,
            `blog_post.has_been_fact_checked`,
            `blog_post.published_date`,
            `blog_post.ordering`,
            `auth_user.username`,
            `auth_user.email`
FROM        `blog_post` 
INNER JOIN  `auth_user` 
ON          `blog_post`.`author_id` = `auth_user`.`id`
WHERE       `blog_post`.`is_approved` = 1 AND 
            `blog_post`.`has_been_fact_checked` = 1 AND 
            `blog_post`.`published_date` <= '2010-10-25 22:40:05'
ORDER BY    `blog_post`.`published_date` DESC, 
            `blog_post`.`reverse_ordering` DESC, 
            `blog_post`.`id` DESC 
LIMIT 5

You may get rid of IS NULL check, since the inequality condition implies it.

Update:

You also might want to read this article:

Upvotes: 5

Jaydee
Jaydee

Reputation: 4158

To me it looks like the filesort may be killing the speed. If you can get the ORDER BY fields into the index that is being used you may get a speed increase. Try changing:

idx_published_blog_post -> blog_post(is_approved, has_been_fact_checked, published_date)

to

idx_published_blog_post -> blog_post(is_approved, has_been_fact_checked, published_date DESC, ordering ASC, id DESC)

A couple of after thoughts: under what circumstances do you have a null published_date, searching in a date range may be quicker? Also published_date seems to be a date time field, so when sorting do you really have so may posts each second that you require the other sorting fields?

Upvotes: 0

user319198
user319198

Reputation:

Make a view of "blog_post" with all conditions you have applied in query (where clause) and do the joining of "auth_user" direct with this view.

feel free ask if not clear. :)

Upvotes: 0

Related Questions