Tal Galili
Tal Galili

Reputation: 25326

Optimizing a MySQL SELECT query?

How much can the following query be changed to improve its performance on a "large" database?

SELECT SQL_CALC_FOUND_ROWS wp_rb_posts.* 
FROM wp_rb_posts 
WHERE 1=1 
  AND ((guid = '956e208f101562f6654e88e9711276e4') 
   OR (guid = 'http://www.r-bloggers.com/?guid=956e208f101562f6654e88e9711276e4') 
   OR (guid = 'http://www.r-bloggers.com/?guid=956e208f101562f6654e88e9711276e4') 
   OR (MD5(guid) = '956e208f101562f6654e88e9711276e4')) 
 AND wp_rb_posts.post_type 
   IN ('post', 'page', 'attachment', 'revision', 'nav_menu_item') 
 AND (wp_rb_posts.post_status = 'publish' 
        OR wp_rb_posts.post_status = 'future' 
        OR wp_rb_posts.post_status = 'draft' 
        OR wp_rb_posts.post_status = 'pending' 
        OR wp_rb_posts.post_status = 'trash' 
        OR wp_rb_posts.post_status = 'auto-draft' 
        OR wp_rb_posts.post_status = 'inherit' 
        OR wp_rb_posts.post_status = 'private') 
ORDER BY wp_rb_posts.post_date DESC LIMIT 1400, 10;

Upvotes: 1

Views: 190

Answers (2)

Mark Wilkins
Mark Wilkins

Reputation: 41222

Without knowing anything about the table involved, I am making a guess based on the names of fields in the query. My guess is that the most restrictive portion of the query is the guid=... portion. It also appears from the structure of the query that the contents of that field is not completely well defined (sometimes it may have the md5 hash value and sometimes not). If this is correct, then it seems likely that the use of md5(guid) = '...' will mean that it cannot efficiently restrict on that portion of the query.

So if possible, it seems like it would make the search more efficient if you had a field that always contained the md5 hash value of the guid field. Then that field could be indexed and provide a much more efficient lookup for that portion of the query.

To try to clarify, consider just this query:

SELECT SQL_CALC_FOUND_ROWS wp_rb_posts.* 
FROM wp_rb_posts WHERE
   (guid = '956e208f101562f6654e88e9711276e4') OR
   (MD5(guid) = '956e208f101562f6654e88e9711276e4')) 

Assuming that MySQL does not allow indexes using scalar functions (which I don't think it does and if I'm wrong then this entire answer is pointless), then that query would require a full table scan computing md5(guid) on each record. Whereas the following query could be optimized much better (assuming an index existed on both fields:

SELECT SQL_CALC_FOUND_ROWS wp_rb_posts.* 
FROM wp_rb_posts WHERE
   (guid = '956e208f101562f6654e88e9711276e4') OR
   (md5_guid = '956e208f101562f6654e88e9711276e4') 

Upvotes: 0

Mike Veigel
Mike Veigel

Reputation: 3815

I would take a look at the execution plan for that query then make sure you have indices setup to optimize the query. Based on the execution plan, you may also see that re-ordering parts of the query will speed up performance.

To answer your question directly though the famous "It depends" answer is in effect here. I would have to see your DB, run the query, see the execution plan, see what is indexed and see how the order of the clauses generated by the execution plan effects performance etc.

Hope that helps you.

Upvotes: 2

Related Questions