Ibiza
Ibiza

Reputation: 149

MySQL query too slow, 70k rows 12 sec

This query runs for 12+ seconds on VPS. It joins 3 tables. Only first one "topcics" has about 70k rows, others are about 20 and "post_cc" about 1500.

SELECT topics.*, employee.username, accounts.ac_name, accounts.ac_mail
FROM topics
INNER JOIN employee ON employee.id_user = topics.id_owner 
INNER JOIN accounts ON accounts.id_account = topics.id_account 
WHERE topics.status  IN  ('1','3') AND ( topics.id_owner IN (12, 5) OR topics.id_post IN 
    (SELECT DISTINCT(id_post) FROM post_cc WHERE id_employee IN (12, 5) ) )
ORDER BY topics.creationdate DESC LIMIT 0,25

I have already tried (without any improvement) to remove subquery and first "employee" join. If I remove "accounts" join, query runs under 0.1 sec, but all tables data are needed for sorting purpose during paging.

Explain:

+----+--------------------+------------+-----------------+-----------------------+---------+---------+-----------------+-------+----------------------------------------------+
| id | select_type        | table      | type            | possible_keys         | key     | key_len | ref             | rows  | Extra                                        |
+----+--------------------+------------+-----------------+-----------------------+---------+---------+-----------------+-------+----------------------------------------------+
|  1 | PRIMARY            | topics     | ALL             | id_owner,id_account   | NULL    | NULL    | NULL            | 75069 | Using where; Using temporary; Using filesort |
|  1 | PRIMARY            | accounts   | ALL             | PRIMARY               | NULL    | NULL    | NULL            |     5 | Using where; Using join buffer               |
|  1 | PRIMARY            | employee   | eq_ref          | PRIMARY               | PRIMARY | 3       | topics.st_owner |     1 | Using where                                  |
|  2 | DEPENDENT SUBQUERY | post_cc    | unique_subquery | PRIMARY               | PRIMARY | 8       | func,const      |     1 | Using index; Using where                     |
+----+--------------------+------------+-----------------+-----------------------+---------+---------+-----------------+-------+----------------------------------------------+

I have added suggested keys as index, it improved time for 2 sec., but it's still too slow.

Shortened tables:

topics
+--------------------+---------------------+------+-----+---------+----------------+
| Field              | Type                | Null | Key | Default | Extra          |
+--------------------+---------------------+------+-----+---------+----------------+
| id_post            | int(10) unsigned    | NO   | PRI | NULL    | auto_increment |
| id_account         | int(10) unsigned    | YES  | MUL | 0       |                |
| mail               | varchar(256)        | YES  | MUL | NULL    |                |
| from_name          | varchar(512)        | YES  |     | NULL    |                |
| title              | varchar(512)        | YES  |     | NULL    |                |
| content            | text                | YES  |     | NULL    |                |
| id_owner           | int(10) unsigned    | YES  | MUL | NULL    |                |
| creationdate       | datetime            | YES  |     | NULL    |                |
+--------------------+---------------------+------+-----+---------+----------------+

employee
+---------------------+-----------------------+------+-----+---------+----------------+
| Field               | Type                  | Null | Key | Default | Extra          |
+---------------------+-----------------------+------+-----+---------+----------------+
| id_employee         | mediumint(8) unsigned | NO   | PRI | NULL    | auto_increment |
| id_user             | mediumint(8) unsigned | NO   |     | NULL    |                |
| id_owner            | tinyint(1)            | YES  |     | 0       |                |
| active              | tinyint(1)            | YES  |     | 1       |                |
| username            | varchar(64)           | YES  |     | NULL    |                |
| email               | varchar(128)          | YES  |     | NULL    |                |
+---------------------+-----------------------+------+-----+---------+----------------+

accounts
+----------------------------+---------------------+------+-----+---------+----------------+
| Field                      | Type                | Null | Key | Default | Extra          |
+----------------------------+---------------------+------+-----+---------+----------------+
| id_account                 | int(10) unsigned    | NO   | PRI | NULL    | auto_increment |
| ac_mail                    | int(10) unsigned    | YES  | UNI | NULL    |                |
| ac_name                    | varchar(512)        | YES  |     | NULL    |                |
| last_sync_time             | datetime            | YES  |     | NULL    |                |
+----------------------------+---------------------+------+-----+---------+----------------+

post_cc
+------------------------+---------------------+------+-----+---------+-------+
| Field                  | Type                | Null | Key | Default | Extra |
+------------------------+---------------------+------+-----+---------+-------+
| id_post                | int(10) unsigned    | NO   | PRI | NULL    |       |
| id_employee            | int(10) unsigned    | NO   | PRI | NULL    |       |
| notifications          | tinyint(3) unsigned | YES  |     | 1       |       |
+------------------------+---------------------+------+-----+---------+-------+

Upvotes: 3

Views: 339

Answers (3)

RandomSeed
RandomSeed

Reputation: 29759

The culprit:

75069 (rows) | Using where; Using temporary; Using filesort

That's what we need to get rid of.

Possible solution: add an index on topics.creationdate.

As a side note, the query also has conditions on id_post, st_owner, and status, therefore a composite index on topics(creationdate, id_post, st_owner, status) (or any permutation of the last three columns -- test with your data set) might help even further. However, your query seems to pull most of your table anyways, so I expect a simple index will suffice.

Upvotes: 0

spencer7593
spencer7593

Reputation: 108390

One likely suspect is that DEPENDENT SUBQUERY.

MySQL is processing that subquery for each row returned by the outer query (which isn't already filtered out by some other predicate.

To improve performance, consider re-writing that either as a JOIN operation or an EXISTS predicate.

To replace that with a JOIN operation, that will need to be an OUTER JOIN (rather than an INNER JOIN) because of the OR in the predicate.

As an example of one way to do that:

SELECT topics.*
     , employee.username
     , accounts.ac_name
     , accounts.ac_mail
  FROM topics
  JOIN employee ON employee.id_user = topics.id_owner 
  JOIN accounts ON accounts.id_account = topics.id_account
  LEFT
  JOIN ( SELECT DISTINCT q.id_post
           FROM post_cc q 
          WHERE q.id_employee IN (12, 5) 
       ) p
    ON p.id_post = topics.id_post   
 WHERE topics.status IN ('1','3') 
   AND ( topics.id_owner IN (12, 5) 
       OR p.id_post IS NOT NULL
       )
 ORDER BY topics.creationdate DESC LIMIT 0,25

I recommend you run an EXPLAIN on that, and see how that performs.


Another option is to consider an EXISTS predicate. Occassinally we can get this to perform better, but often times not.

SELECT topics.*
     , employee.username
     , accounts.ac_name
     , accounts.ac_mail
  FROM topics
  JOIN employee ON employee.id_user = topics.id_owner 
  JOIN accounts ON accounts.id_account = topics.id_account
 WHERE topics.status IN ('1','3') 
   AND ( topics.id_owner IN (12, 5) 
       OR EXISTS ( SELECT 1 
                     FROM post_cc q
                    WHERE q.id_employee IN (12, 5)
                      AND q.id_post = topics.id_post
                 )
       )
 ORDER BY topics.creationdate DESC LIMIT 0,25

For performance, that's going to almost require a suitable covering index for the subquery in the EXISTS clause, for example:

ON post_cc (id_post, id_employee)

You can try running an EXPLAIN and see how that performs as well.


We see that MySQL isn't using an index on the topics table.

We might get MySQL to avoid an expensive "Using filesort" operation if we had an index with a leading column of creationdate.

And part of the problem is likely that OR in the predicate. We might try re-writing that query as two separate queries, and combining them with a UNION ALL set operation. But if we do that, we'd really like to see an index on topic being used (we probably won't improve performance by incurring two scans of 70,000 rows.

SELECT topics.*
     , employee.username
     , accounts.ac_name
     , accounts.ac_mail
  FROM topics
  JOIN employee ON employee.id_user = topics.id_owner 
  JOIN accounts ON accounts.id_account = topics.id_account
 WHERE topics.status IN ('1','3')
   AND topics.id_owner IN (12, 5)

 UNION ALL

SELECT topics.*
     , employee.username
     , accounts.ac_name
     , accounts.ac_mail
  FROM topics
  JOIN employee ON employee.id_user = topics.id_owner 
  JOIN accounts ON accounts.id_account = topics.id_account
  JOIN ( SELECT DISTINCT q.id_post
           FROM post_cc q 
          WHERE q.id_employee IN (12, 5) 
       ) p
    ON p.id_post = topics.id_post  
 WHERE topics.status IN ('1','3')
   AND ( topics.id_owner NOT IN (12, 5) OR topics.id_owner IS NULL )

 ORDER BY 8 DESC LIMIT 0,25

With a query of that form, we're more likely to get MySQL use a suitable index on the topics table,

... ON topics (id_owner, status)

... ON topics (id_post, status, id_owner)

Upvotes: 2

Tomi
Tomi

Reputation: 245

Why don't you use left join for post_cc table, and then use condition?

Something like this.

SELECT topics.*, 
employee.username, 
accounts.ac_name, 
accounts.ac_mail
FROM topics
INNER JOIN employee ON employee.id_user = topics.id_owner 
INNER JOIN accounts ON accounts.id_account = topics.id_account 
LEFT JOIN post_cc ON id_employee IN (12, 5)
WHERE topics.status  IN  ('1','3') 
AND ( topics.id_owner IN (12, 5) OR topics.id_post IN (post_cc.post_id)
ORDER BY topics.creationdate DESC LIMIT 0,25;

Upvotes: 0

Related Questions