Reputation: 267020
Say I have about 10-30 rows returned from my query:
select * from users where location=10;
Is there any difference between the following:
select *
from users u
inner join location l on u.location = l.id
where location=10;
versus:
select * from users where location=10; # say this returns 1,2,3,4,5
select * from location where id IN (1,2,3,4,5)
Basically I want to know if there are any performance differences between doing an inner join versus doing a WHERE IN clause.
Upvotes: 0
Views: 4232
Reputation: 196
One way to compare the performance of different queries is to use postgresql's EXPLAIN command, for instance:
EXPLAIN select *
from users u inner join
location l
on u.location = l.id
where u.location = 10;
Which will tell you how the database will get the data. Watch for things like sequential scans, which indicate you could benefit from an index. It also gives estimates on the cost of each operation and how many rows it might be operating on. Some operations can yield a lot more rows than you would expect, which the database then reduces to the set it returns to you.
You can also use EXPLAIN ANALYZE [query], which will actually run the query and give you timing information.
See the postgresql documentation for more information.
Upvotes: 1
Reputation: 1269873
Is there a difference between issuing one query and issuing two queries? Well, I certainly hope so. The SQL engine is doing work, and it does twice as much work (from a certain perspective) for two queries.
In general, parsing a single query is going to be faster than parsing one query, returning an intermediate result set, and then feeding it back to another query. There is overhead in query compilation and in passing data back and forth.
For this query:
select *
from users u inner join
location l
on u.location = l.id
where u.location = 10;
You want an index on users(location)
and location(id)
.
I do want to point something else out. The queries are not equivalent. The real comparison query is:
select l.*
from location l
where l.id = 10;
You are using the same column for the where
and the on
. Hence, this would be the most efficient version and you want an index on location(id)
.
Upvotes: 1