Antoine Lefebvre
Antoine Lefebvre

Reputation: 376

Why is my SQL request much slower when the condition is in the join?

I have a table that contains the following columns:

  1. ID
  2. Master ID

The Master ID can be shared between different rows with different IDs.

E.g.:

ID | Master ID
1  |  1
2  |  1
3  |  1
4  |  2

Knowing the ID I want to retrieve all the rows that share the same master ID

I managed to do it using this query:

Select * 
FROM table t
LEFT JOIN table t2
ON t.MASTER_ID = t2.MASTER_ID 
Where t.ID = '1'

Then I also tried using:

Select * 
FROM table t
LEFT JOIN table t2
ON t.MASTER_ID = t2.MASTER_ID and t.ID = '1'

In that case, it was much slower. Can anyone explain why?

Upvotes: 0

Views: 83

Answers (4)

Marmite Bomber
Marmite Bomber

Reputation: 21075

You don't need an OUTER JOIN. The reason is simple, your are joining the same table on the same column - there can't be a non-match!

The query to use is therefore

 Select * 
 FROM T
 INNER  JOIN T t2    
 ON T.MasterID = t2.MasterID 
 Where t.ID = 1 

Upvotes: 0

GarethD
GarethD

Reputation: 69789

The queries are doing different things, the first you are saying:

1. give me all rows from `table` where `id = 1`
2. Also give me rows from t2 with a matching master ID

In the second you are saying

1. Give me all rows from `table`
2. Return rows from `t2` with a matching master ID and where `t1.ID = 1`

In a simple example you might have

ID  Master_ID
------------------------
1   1
2   1
3   1
4   2

So your first query will return:

t1.ID   t1.Master_ID    t2.ID   t2.Master_ID
--------------------------------------------
1           1           1       1
1           1           2       1
1           1           3       1

Your second query will return

t1.ID   t1.Master_ID    t2.ID   t2.Master_ID
--------------------------------------------
1           1           1       1
1           1           2       1
1           1           3       1
2           1           NULL    NULL
3           1           NULL    NULL
4           2           NULL    NULL

So basically in the first query you are returning a limited number of rows from your table, whereas in the second you return all rows, but only join to some of them.

Upvotes: 3

If the t.ID = '1' condition is in the WHERE clause the t.ID='1' condition only has to be evaluated for the number of rows in t. If the t.ID='1' condition is put into the ON clause for the join it must be evaluated for all rows in t2. If there are a lot of rows in t2 this can significantly increase the run time of the query.

Upvotes: 2

Rahul
Rahul

Reputation: 77896

You shouldn't include t.ID = '1' in JOIN ON condition since it's not the joined table. condition on the table in FROM part should stay in WHERE clause; whereas condition belongs to joined table should be moved to join on clause so to get a proper outer join effect rather a inner join effect.

Upvotes: 1

Related Questions