Scott Davies
Scott Davies

Reputation: 103

Invalid results from SQL Server "NOT IN" clause

I have run a query on our SQL Server 2012 which returned no results. I discovered that this was incorrect and I SHOULD have gotten 16 records. I changed the query and get the answer expected but I am at a loss to understand why my original query did not work as expected.

So my ORIGINAL query which returned no results was:

SELECT
    WPB.[ID number]
FROM
    [Fact].[REPORT].[WPB_LIST_OF_IDS] WPB
WHERE
    [ID number] NOT IN (SELECT DISTINCT IdNumber 
                        FROM MasterData.Dimension.Customer DC)

The reworked query is this:

SELECT
    WPB.[ID number]
FROM
    [Fact].[REPORT].[WPB_LIST_OF_IDS] WPB
LEFT JOIN
    MasterData.Dimension.Customer DC ON WPB.[ID number] = DC.IdNumber
WHERE
    DC.IdNumber IS NULL

Can anyone tell me WHY the first query (which incidentally runs in fractions of a second vs the 2nd which takes a minute) does not work? I don't want to repeat this mistake in the future!

Upvotes: 1

Views: 1370

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269763

Don't use not in with a subquery. It doesn't work the way you expect with NULL values. If any value returned by the subquery is NULL, then no rows are returned at all.

Instead, use not exists. This has the semantics that you expect:

select wpb.[ID number]
from [Fact].[REPORT].[WPB_LIST_OF_IDS] wpb
where not exists (select 1
                  from MasterData.Dimension.Customer dc
                  where wpb.[ID number] = dc.IdNumber
                 );

Of course, the left join method also works.

Upvotes: 5

Related Questions