Reputation: 68658
I'm just wondering if all of the following joins are logically equivalent, and if not, why not?
SELECT t1.x, t2.y from t1, t2 where t1.a=t2.a and t1.b=t2.b and t1.c = t2.c;
SELECT t1.x, t2.y from t1 join t2 on t1.a=t2.a where t1.b=t2.b and t1.c = t2.c;
SELECT t1.x, t2.y from t1 join t2 on t1.a=t2.a and t1.b=t2.b where t1.c = t2.c;
SELECT t1.x, t2.y from t1 join t2 on t1.a=t2.a and t1.b=t2.b and t1.c = t2.c;
I guess my real question is: does combining "where" with "on" doing something different from just having multiple conditions ANDed together with "on"?
I work with MySQL, in case that makes a difference.
Upvotes: 5
Views: 620
Reputation: 28204
For INNER JOIN it makes no logical difference and optimizer should produce same plans. But for OUTER joins it becomes important whether you put condition in WHERE or FROM ... JOIN clause. This is because FROM and ON clauses are processed before WHERE clause: ANSI SQL logical query processing http://www.sqlmag.com/Files/09/94378/Figure_01.jpg
Upvotes: 2
Reputation: 5078
They are logically equivalent. However, where you define the join conditions makes a difference as to how many records are used in the temporary table on which the where clause is applied. That is,
If table t1, t2 and t3 had 10 records each, the statement,
SELECT t1.x, t2.y from t1, t2 where t1.a=t2.a and t1.b=t2.b and t1.c = t2.c;
results in 1000 records of a permutation of the three tables records and then the where clause is applied.
For
SELECT t1.x, t2.y from t1 join t2 on t1.a=t2.a and t1.b=t2.b and t1.c = t2.c;
only ten records are in the temporary table before any where clause (none in this case) is applied. The second method is much faster when working with large tables.
Upvotes: 1
Reputation: 562891
Yes, as others have stated, the result is the same from all these queries.
FWIW, you can also use this shorthand syntax when you're doing an equi-join on column names that are the same in both tables:
SELECT t1.x, t2.y from t1 join t2 using (a, b, c);
As far as optimization, it should be optimized the same. That is, the RDBMS should be smart enough to analyze the WHERE
syntax the same, and perform joins instead of generating an intermediate huge cross-join result and applying filtering conditions to it. This is such a common type of query, that it's also common for a given RDBMS implementation to recognize and optimize it.
In the case of MySQL, join and where are (kind of) evaluated together. Try using EXPLAIN
to analyze your query. If the "type
" column indicates "eq_ref
" it means it's using an indexed join. This is the best type of join with respect to optimization. If "type
" is "ref
" it's good too.
You can get these join optimization types whether you put the condition in the JOIN...ON
clause or the WHERE
clause.
Upvotes: 2
Reputation: 132710
They are logically equivalent and should produce the same result. However, the last one is to be preferred as it states more correctly the semantics of the query - i.e. "join tables t1 and t2".
The WHERE clause should be used for "filtering" results of the join - e.g.
... WHERE t2.some_col > 10
Also, as Constantin has said in another answer, the 4 queries would be different if the join was an OUTER join.
Upvotes: 5