Reputation: 31
There's a similar question here, but my doubt is slight different:
select *
from process a inner join subprocess b on a.id=b.id and a.field=true
and b.field=true
So, when using inner join
, which operation comes first: the join or the a.field=true
condition?
As the two tables are very big, my goal is to filter table process
first and after that join only the rows filtered with table subprocess
.
Which is the best approach?
Upvotes: 2
Views: 9569
Reputation: 10046
First things first:
which operation comes first: the join or the a.field=true condition?
Your INNER JOIN
includes this (a.field=true
) as part of the condition for the join. So it will prevent rows from being added during the JOIN
process.
A part of an RDBMS is the "query optimizer" which will typically find the most efficient way to execute the query - there is no guarantee on the order of evaluation for the INNER JOIN
conditions.
Lastly, I would recommend rewriting your query this way:
SELECT *
FROM process AS a
INNER JOIN subprocess AS b ON a.id = b.id
WHERE a.field = true AND b.field = true
This will effectively do the same thing as your original query, but it is widely seen as much more readable by SQL programmers. The optimizer can rearrange INNER JOIN
and WHERE
predicates as it sees fit to do so.
Upvotes: 5
Reputation: 1300
You are thinking about SQL in terms of a procedural language which it is not. SQL is a declarative language, and the engine is free to pick the execution plan that works best for a given situation. So, there is no way to predict if a join or a where will be executed first.
A better way to think about SQL is in terms of optimizing queries. Things like assuring that your joins and wheres are covered by indexes. Also, at least in MS Sql Server, you can preview an estimated or actual execution plan. There is nothing stopping you from doing that and seeing for yourself.
Upvotes: 0