pehaada
pehaada

Reputation: 513

Excluding a Null value returns 0 rows in a sub query

I'm trying to clean up some data in SQL server and add a foreign key between the two tables.

I have a large quantity of orphaned rows in one of the tables that I would like to delete. I don't know why the following query would return 0 rows in MS SQL server.

--This Query returns no Rows

select * from tbl_A where ID not in ( select distinct ID from tbl_B )

When I include IS NOT NULL in the subquery I get the results that I expect.

-- Rows are returned that contain all of the records in tbl_A but Not in tbl_B

select * from tbl_A where ID not in ( select distinct ID from tbl_B where ID is not null )

The ID column is nullable and does contain null values. IF I run just the subquery I get the exact same results except the first query returns one extra NULL row as expected.

Upvotes: 1

Views: 5695

Answers (5)

Jaloopa
Jaloopa

Reputation: 750

You probably have ANSI NULLs switched off. This compares null values so null=null will return true.

Prefix the first query with

SET ANSI_NULLS ON
GO

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269773

If the ids are never negative, you might consider something like:

select *
from tbl_A
where coalesce(ID, -1) not in ( select distinct coalesce(ID, -1) from tbl_B )

(Or if id is a string, use something line coalesce(id, '<null>')).

This may not work in all cases, but it has the virtue of simplicity on the coding level.

Upvotes: 0

TimothyAWiseman
TimothyAWiseman

Reputation: 14873

The problem is the non-comparability of nulls. If you are asking "not in" and there are nulls in the subquery it cannot say that anything anything is definitely not in becuase it is looking at those nulls as "unknown" and so the answer is always "unknown" in the three value logic that SQL uses.

Now of course that is all assuming you have ANSI_NULLS ON (which is the default) If you turn that off then suddenly NULLS become comparable and it will give you results, and probably the results you expect.

Upvotes: 0

Luther
Luther

Reputation: 300

Matching on NULL with equals (=) will return NULL or UNKNOWN as opposed to true/false from a logic standpoint. E.g. see http://msdn.microsoft.com/en-us/library/aa196339(v=sql.80).aspx for discussion.

If you want to include finding NULL values in table A where there is no NULL in table B (if B is the "parent" and A is the "child" in the "foreign key" relationship you desire) then you would need a second statement, something like the following. Also I would recommend qualifying the ID field with a table prefix or alias since the field names are the same in both tables. Finally, I would not recommend having NULL values as the key. But in any case:

select * from tbl_A as A where (A.ID not in ( select distinct B.ID from tbl_B as B ))
  or (A.ID is NULL and not exists(select * from tbl_B as B where B.ID is null))

Upvotes: 1

Taryn
Taryn

Reputation: 247690

This is the expected behavior of the NOT IN subquery. When a subquery returns a single null value NOT IN will not match any rows.

If you don't exclusively want to do a null check, then you will want to use NOT EXISTS:

select * 
from tbl_A A
where not exists (select distinct ID 
                  from tbl_B b
                  where a.id = b.id)

As to why the NOT IN is causing issues, here are some posts that discuss it:

NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL

NOT EXISTS vs NOT IN

What's the difference between NOT EXISTS vs. NOT IN vs. LEFT JOIN WHERE IS NULL?

Upvotes: 2

Related Questions