Reputation: 7155
Is there a performance difference in the following?
SELECT person.id
FROM person
LEFT JOIN address ON person.id = address.personID
WHERE address.personID IS NOT NULL
vs
SELECT person.id
FROM person
INNER JOIN address ON person.id = address.personID
This query should show all person ID's that have an address record (not all do). The logical thing to do here would be to use an Inner Join as the second example shows. For reasons that are not entirely important (the query is being generated from a query builder), I may have to use the first approach.
Curious what the impact is. Does MySQL do a lot of extra work when it LEFT JOIN's and then compares that field against null to reduce the set? Maybe that's how INNER JOIN works behind the scenes anyway?
Upvotes: 11
Views: 8703
Reputation: 562921
It may depend on the version of MySQL, because optimizer code is improved in each release. This might be a case where an older version does more work for the left outer join, resulting in a table-scan of person
even if it would be more efficient to find a specific address
and then do the join in the reverse direction.
@spencer7593 demonstrates a case where the two join types result in a different table ordering by the optimizer, which means that a left join forces the left table to be accessed first. (though in his example, the "using join buffer" indicates that there's no index for the join, so this could be an anomaly.)
But I have seen cases where the optimizer detects that the query is equivalent to an inner join, because you have conditions in the WHERE clause on the "outer" table. So it produces exactly the same optimization plan for a left outer join as it does for an inner join, and allows for table re-ordering.
Upvotes: 3
Reputation: 108510
As demonstrated below, there is a potential for different execution plans for these two queries:
SELECT p.*, s.*
FROM p
LEFT
JOIN s ON s.col = p.col
WHERE s.col IS NOT NULL
SELECT p.*, s.*
FROM p
INNER
JOIN s ON s.col = p.col
id select_type table type poss key key_len ref rows Extra
-- ----------- ------ ---- ---- ---- ------- ----- ---- --------
1 SIMPLE p ALL - - - - 3
1 SIMPLE s ref s_ix s_ix 9 p.col 1
id select_type table type poss key key_len ref rows Extra
-- ----------- ------ ---- ---- ---- ------- ----- ---- -----------------------------
1 SIMPLE s ALL s_ix - - - 2
1 SIMPLE p ALL p_ix - - - 3 Using where; Using join buffer
So, we have to conclude that there could be a difference in performance. On small sets, the difference is going to be negligible. It's possible that large sets will show a significant difference in performance; we'd expect the INNER JOIN to be more efficient. It's entirely possible there's a test case that demonstrates better performance of the LEFT JOIN, but I haven't found it.
Upvotes: 5
Reputation: 25862
generally LEFT JOIN
will take longer than an INNER JOIN
because a LEFT JOIN
is not only doing the comparisons that an INNER JOIN
is doing but also doing the comparisons for extra rows that do not match (i.e. null
rows). so unless there is data that you want that will get filtered out by an INNER JOIN
... I would recommend using an INNER JOIN
Upvotes: 2