Reputation: 918
I'm using PostgreSQL 9.3 and have the following tables (simplified to only show the relevant fields):
SITES:
id
name
...
DEVICES:
id
site_id
mac_address UNIQUE
...
Given the mac_address
of a particular device, and I want to get the details of the associated site
. I have the following two queries:
Using LEFT JOIN:
SELECT s.* FROM sites s
LEFT JOIN devices d ON s.id = d.site_id
WHERE d.mac_address = '00:00:00:00:00:00';
Using SUBQUERY:
SELECT s.* FROM sites s
WHERE s.id IN (SELECT d.site_id FROM devices d WHERE d.mac_address = '00:00:00:00:00:00');
Which of the two queries would have the best performance over an infinitely growing database? I have always leaned towards the LEFT JOIN
option, but would be interested to know how the performance of both rates on a large data set.
Upvotes: 16
Views: 18667
Reputation: 325081
It generally won't make any difference, because they should result in the same query plan. At least, an EXISTS
subquery will; IN
isn't as always as intelligently optimised.
For the subquery, rather than using IN (...)
you should generally prefer EXISTS (...)
.
SELECT s.*
FROM sites s
WHERE EXISTS (
SELECT 1
FROM devices d
WHERE d.mac_address = '00:00:00:00:00:00'
AND d.site_id = s.id
);
Upvotes: 19