Reputation: 9839
Assume 2 tables t_left and t_right indexed on column val1.
They do have 100000 records each randomized on val1.
I have 2 options to extract records from t_left not present in t_right:
1.SELECT *
FROM t_left o
WHERE o.val1 not in (Select val1
FROM t_right h)
2.SELECT *
FROM t_left o
WHERE o.val1 not in (Select val1
FROM t_right h
where h.val1 = o.val1)
Surprisingly,I see the same execution plan for both approaches.
Are these just syntactic quirks;or they do have unique appropriate used-cases?
I am using sql server 2008.
Upvotes: 1
Views: 107
Reputation: 453950
They can return different results if t_right
contains NULL
.
NOT IN (x, y, NULL)
always returns an empty set but the equality predicate implicitly excludes any NULL
from t_right
.
Additionally they can return different results if t_left
contains NULL
and t_right
is non empty.
NULL NOT IN (...)
is never true except if (...)
is an empty set so any NULL
s in the left hand side are only preserved if t_right
is totally empty or by the second query.
If neither column is nullable both queries will have the same semantics and likely the same plan.
Upvotes: 2
Reputation: 1271161
The reason the two query plans are the same (at least at the level of the execution plan) is because the two queries have a certain equivalency.
Take the first query:
SELECT *
FROM t_left o
WHERE o.val1 not in (Select val1 FROM t_right h);
One way that SQL Server can execute this query is by using the index on t_right.val1
. In other words, the query gets executed as: "Look up each val1
in t_left
in the index on t_right.val1
. If you don't find it, then do accept the record."
The second query adds the correlation to this:
SELECT *
FROM t_left o
WHERE o.val1 not in (Select val1 FROM t_right h WHERE h.val1 = o.val1);
Guess what? Exactly the same execution approach can work for this query. The correlation just makes the index lookup more explicit.
By the way, there is another way to express this query:
SELECT *
FROM t_left o
WHERE not exists (Select val1 FROM t_right h WHERE h.val1 = o.val1);
My guess is that this would also have the same execution plan.
Martin's comment is very interesting, because the two queries are not the same, unless val1
is the primary key in t_right
.
Upvotes: 0