sfinks_29
sfinks_29

Reputation: 918

Performance: LEFT JOIN vs SUBQUERY

I'm using PostgreSQL 9.3 and have the following tables (simplified to only show the relevant fields):

SITES:
id
name
...

DEVICES:
id
site_id
mac_address UNIQUE
...

Given the mac_address of a particular device, and I want to get the details of the associated site. I have the following two queries:

Using LEFT JOIN:

SELECT s.* FROM sites s
LEFT JOIN devices d ON s.id = d.site_id
WHERE d.mac_address = '00:00:00:00:00:00';

Using SUBQUERY:

SELECT s.* FROM sites s
WHERE s.id IN (SELECT d.site_id FROM devices d WHERE d.mac_address = '00:00:00:00:00:00');

Which of the two queries would have the best performance over an infinitely growing database? I have always leaned towards the LEFT JOIN option, but would be interested to know how the performance of both rates on a large data set.

Upvotes: 16

Views: 18667

Answers (1)

Craig Ringer
Craig Ringer

Reputation: 325081

It generally won't make any difference, because they should result in the same query plan. At least, an EXISTS subquery will; IN isn't as always as intelligently optimised.

For the subquery, rather than using IN (...) you should generally prefer EXISTS (...).

SELECT s.*
FROM sites s
WHERE EXISTS (
  SELECT 1
  FROM devices d
  WHERE d.mac_address = '00:00:00:00:00:00'
    AND d.site_id = s.id
);

Upvotes: 19

Related Questions