Reputation: 5386
I've got a pice of code which looks up a few CD's in a database by their ID. This is done using an 'IN' condition :
(1) SELECT * FROM album WHERE id IN (?,?,?,?,?)
The next step is to get tracks associated with these albums. I do this by modifying the 'base' query a little.
(2) SELECT track.* FROM album
LEFT JOIN track ON track.album_id = album.id
WHERE album.id IN(?,?,?,?,?)
Now, I've got both albums and tracks. However, I need to load composer details for the tracks. For a couple of reasons, I can't do it along with the above query, so I need to do it separately.
What I could do is to look up composers based on the tracks I've got, where I would use the track ids and look up composers in the composers table based on these track ids. Alternatively, I could modify the 'base' query even further and do one more join with the composers table. However; is there a general rule here (with regards to performance) which easily point out one of the queries as favourable? I've been testing a little, but I've done it on such a small scale that I can't really see any difference...
(3) SELECT composer.* FROM album
LEFT JOIN track ON track.album_id = album.id
LEFT JOIN composer ON composer.track_id = track.id
WHERE album.id IN (?,?,?,?,?)
...or...
[get track ids from query (2)]
(4) SELECT composer.* FROM composer
WHERE composer.track_id IN (?,...);
For the record: I've got indexes in place on all criteria and join columns.
Upvotes: 4
Views: 6562
Reputation: 26753
There is no reason to use LEFT JOIN
in query 3 (nice of you to number them) since you only care about composers. (In general outer joins are slower.)
You don't need to join the albums table in query 3 - just use the IN
on track.album_id
. (I assume you are not worried about rogue tracks with missing albums.)
You mentioned you have indexes on various things. But remember that MySQL can only use one index per table per query. So if you have multiple things to check you must make a composite index.
Surprisingly doing a join is usually faster then a large IN statement, this is because the values in the IN are not indexed, so MySQL can not do an index join on them. But this only applies with lots of values - for small number of them using IN could be faster.
Personally, I would use the JOIN method until the point that you see this query becoming a problem. (Which would only happen if you had some very complex conditionals to check, that could get slow to do twice). The join is simpler code, and most likely will be super fast - so don't make things more complicated without a specific reason.
Upvotes: 6