Sequenzia
Sequenzia

Reputation: 2381

SQL Queries SELECT IN and SELECT NOT IN

Does anyone know why the results of the following 2 queries do not add up to the results of the 3rd one?

SELECT COUNT(leadID) FROM leads 
WHERE makeID NOT IN (SELECT uploadDataMapID FROM DG_App.dbo.uploadData WHERE uploadID = 3 AND uploadRowID = 1)
AND modelID NOT IN (SELECT uploadDataMapID FROM DG_App.dbo.uploadData WHERE uploadID = 3 AND uploadRowID = 2)

SELECT COUNT(leadID) FROM Leads 
WHERE makeID IN (SELECT uploadDataMapID FROM DG_App.dbo.uploadData WHERE uploadID = 3 AND uploadRowID = 1)
OR modelID IN (SELECT uploadDataMapID FROM DG_App.dbo.uploadData WHERE uploadID = 3 AND uploadRowID = 2)

SELECT COUNT(leadID) FROM Leads

The first query is the count I need. The second one is to tell the user how many records were suppressed based on the contents of the DG_App.dbo.uploadData table. The third query is just a straight count of all the records.

When I run these the results of query 1 + the results of query 2 comes up about 46K records less than the count of the entire table. I have played with grouping the WHERE statements with () but that did not change the counts at all.

This is MSSQL Server 2012.

Any input on this would be great.

Thanks

Upvotes: 0

Views: 836

Answers (3)

fthiella
fthiella

Reputation: 49089

Do you have any NULL values? They won't appear on the first neither on the second count, i think you are missing those.

SELECT Count(leadID) FROM Leads WHERE makeID is Null or modelID is null

Edit: I have to add some more considerations: since there's an AND and an OR condition things gets a little more complicated!

  • Null IN () is always evaluated as Null, rows won't appear
  • Null NOT IN () is always evaluated as Null, rows won't appear
  • Null AND Something is evaluated as Null or as False, rows won't appear
  • Null OR Something is evaluated as True if Something is True, otherwise is evaluated as Null, only some rows will appear

This means that in your first two queries, some records whith Nulls are counted already, while others are not. This is what you miss:

  • You miss records where both makeID and modelID are Null:

    SELECT Count(leadID) FROM Leads WHERE makeID is Null and modelID is null
    
  • But you also miss record where makeID is Null, and modelID is not null and is not in second subquery:

    SELECT Count(leadID) FROM Leads
    WHERE makeID is Null
    AND modelID NOT IN (...)
    
  • And also those where modelID is null and makeID is not in first subquery:

    SELECT Count(leadID) FROM Leads
    WHERE makeID is NOT IN (...)
    AND modelID is Null
    

Upvotes: 4

Nick Clark
Nick Clark

Reputation: 21

Three cases I can think of:

  • An uploadDataMapID has an uploadID that does not equal 3
  • An uploadDataMapID has an uploadRowID that does not equal 1 or 2
  • There are NULL makeID or modelID entries in your leads table

SELECT count(leadID) FROM leads WHERE makeID IS NULL OR modelID IS NULL

SELECT count(leadID) 
FROM leads AS l INNER JOIN DG_App.dbo.uploadData AS u ON l.makeID = u.uploadDataMapID
WHERE u.uploadID <> 3 OR u.uploadRowID <> 1

SELECT count(leadID)
FROM leads AS l INNER JOIN DG_App.dbo.uploadData AS u ON l.modelID = u.uploadDataMapID
WHERE u.uploadID <> 3 OR u.uploadRowID <> 2

These three queries should give you the missing records.

Upvotes: 2

RichardTheKiwi
RichardTheKiwi

Reputation: 107786

Here's some explanation.

-- sample tables to discuss
create table innotintest (a int);
insert into innotintest values (1),(2),(3),(4),(null);
create table other (b int);
insert into other values (2),(3);
create table other1 (c int);
insert into other1 values (2),(3),(null);

select * from innotintest where a not in (select b from other); -- RESULT = 1,4
select * from innotintest where a in (select b from other);     -- RESULT = 2,3
--Overall outcome : Record with a=NULL is missing. It is neither IN nor NOT IN

select * from innotintest where a not in (select c from other1); -- RESULT = (none!)
select * from innotintest where a in (select c from other1);     -- RESULT = 2,3
--Overall outcome : Record with a=NULL is missing
--                  Moreover, when other1 contains NULL values,
--                            NOTHING is returned for NOT IN

To reiterate the comments in the code

  1. NULL is neither IN() nor NOT IN()
  2. When NOT IN(...) contains NULLs in the set, nothing will match against the filter

Upvotes: 1

Related Questions