paul78
paul78

Reputation: 85

Simple MySQL Subquery Performance

Consider the 2 MySQL queries:

SELECT ue.userid,e.courseid 
FROM (SELECT id,courseid FROM mdl_enrol WHERE status = 0 AND courseid IN (46)) e 
INNER JOIN (SELECT enrolid,userid FROM mdl_user_enrolments ) ue ON ue.enrolid = e.id 
INNER JOIN (SELECT userid FROM mdl_userdata) ud ON ue.userid = ud.userid

--

SELECT ue.userid,e.courseid 
FROM mdl_enrol e 
INNER JOIN mdl_user_enrolments ue ON ue.enrolid = e.id 
INNER JOIN mdl_userdata ud ON ue.userid = ud.userid
WHERE e.status = 0 AND e.courseid IN (46)

The bottom query is much much faster than the top query, but why? I've read that to increase performance you should only select the columns you need. Also, to me the top query should perform better because in each JOIN, you're reducing the amount of data you're joining. Obviously my understanding of how databases work is wrong, but if anyone could clear this up it would be much appreciated. An EXPLAIN also confirms that the bottom query is much faster.

Many thanks.

Upvotes: 2

Views: 2551

Answers (1)

zerkms
zerkms

Reputation: 254926

In the first query mysql should select a subset from mdl_enrol table and complete mdl_user_enrolments and mdl_userdata into the memory. So you select a lot of data in memory. After you've done that - you join the data. If there is not enough memory to put all the data until it's joined and sent back to the client - then temporary table on the hard drive is created. Most likely mysql optimizer isn't cool enough to fix your mistake and try to improve execution plan. That's why it's slow.

Whereas for the second query mysql knows what exactly it needs to select and only selects the small amount of required data. In this scenario it's possible to use indexes (assuming all the necessary indexes have been created). So it's fast.

Upvotes: 4

Related Questions