Reputation: 103
I have run a query on our SQL Server 2012 which returned no results. I discovered that this was incorrect and I SHOULD have gotten 16 records. I changed the query and get the answer expected but I am at a loss to understand why my original query did not work as expected.
So my ORIGINAL query which returned no results was:
SELECT
WPB.[ID number]
FROM
[Fact].[REPORT].[WPB_LIST_OF_IDS] WPB
WHERE
[ID number] NOT IN (SELECT DISTINCT IdNumber
FROM MasterData.Dimension.Customer DC)
The reworked query is this:
SELECT
WPB.[ID number]
FROM
[Fact].[REPORT].[WPB_LIST_OF_IDS] WPB
LEFT JOIN
MasterData.Dimension.Customer DC ON WPB.[ID number] = DC.IdNumber
WHERE
DC.IdNumber IS NULL
Can anyone tell me WHY the first query (which incidentally runs in fractions of a second vs the 2nd which takes a minute) does not work? I don't want to repeat this mistake in the future!
Upvotes: 1
Views: 1370
Reputation: 1269763
Don't use not in
with a subquery. It doesn't work the way you expect with NULL
values. If any value returned by the subquery is NULL
, then no rows are returned at all.
Instead, use not exists
. This has the semantics that you expect:
select wpb.[ID number]
from [Fact].[REPORT].[WPB_LIST_OF_IDS] wpb
where not exists (select 1
from MasterData.Dimension.Customer dc
where wpb.[ID number] = dc.IdNumber
);
Of course, the left join
method also works.
Upvotes: 5