Reputation: 16281
I was actually trying some thing similar with a sub query, and couldn’t understand what was going wrong. I have managed to simplify the question to the following.
I have a simple table which may include a NULL
in one of the columns:
DROP TABLE IF EXISTS data;
CREATE TEMP TABLE data (
id INT PRIMARY KEY,
number INT,
string TEXT
);
INSERT INTO data(id,number,string)
VALUES (1,1,'Apple'),(2,1,'Accordion'),(3,2,'Banana'),(4,2,'Banjo'),(5,NULL,'Whatever');
SELECT * FROM data WHERE number IN(1,2,NULL);
SELECT * FROM data WHERE number NOT IN(1,2,NULL);
There is a live version at https://dbfiddle.uk/KhTzbX_E .
When I look for rows matching the number
column:
SELECT * FROM data WHERE number IN(1,2,NULL);
I get a few results, as expected. This doesn’t include the row where number
is NULL
, but I suppose that the IN
expression is short for WHERE a = b
.
If I look for the non-matches:
SELECT * FROM data WHERE number NOT IN(1,2,NULL);
I get nothing at all.
I can’t see how that can be right. The expression IN(1,2,NULL)
must return a valid list otherwise the first one wouldn’t work.
What is going on here, and is there a correct way to do this?
Note: I know it’s silly to put in the NULL
, but the idea is that the list is supposed to be a sub query which might return a vew NULL
s. I also know that I can filter out the NULL
s in the sub query. However that looks like a workaround to me.
I have tried this in PostgreSQL, MariaDB and Microsoft SQL Server.
Upvotes: 0
Views: 111
Reputation: 656231
Basically, NOT IN
is of very limited use if any null
values are involved. This isn't exactly a new insight. See:
nr NOT IN (1,2,NULL)
translates to:
NOT (nr = 1 OR nr = 2 or nr = null)
nr = null
is always null
, and since false OR null
results in null
, the expression can never become true
. Used in the WHERE
clause, such a filter can never return any rows.
Upvotes: 2
Reputation: 246033
This is an old trap for people who are not experienced with NULL: if a NOT IN
list contains NULL, the result set is always empty.
To understand why, let's rewrite
WHERE number NOT IN (1, 2, NULL)
to the semantically equivalent
WHERE number <> 1 AND number <> 2 AND number <> NULL
Then note that number <> NULL
will always return NULL. This is easiest to understand if you think of “NULL” as “unknown”: when asked if an unknown number is different from any given number, the answer could be “true” or “false”, depending on the unknown value of the unknown number. So the answer must be “unknown”, which is the boolean
value NULL.
Now something AND NULL
can be FALSE
(if something
is FALSE
) or NULL (if something
is NULL or TRUE
), but it never can be TRUE
. And a WHERE
condition only passes rows where the condition is TRUE
. Both FALSE
and NULL won't pass.
Upvotes: 2
Reputation: 35553
IN()
is a shortcut. number IN(1,2,NULL)
equates to:
(number = 1 or number = 2 or number = NULL)
you cannot use equality against NULL you must use IS NULL
instead, hence the IN shortcut won't work as you expected with with NULLs.
As you suggested, it is recommended that you strip out any NULLs if using a subquery.
Upvotes: 0
Reputation: 8314
Any equality check involving NULL will return null, and null does not equal null. Null needs special language using is/is not
My recommendation would be to add another part to your where clause to make it work, and also keep it readable:
SELECT * FROM data WHERE (number NOT IN(1,2) and number is not null);
SELECT * FROM data WHERE (number IN (1,2) or number is null);
Try playing around with null and you'll see how it behaves:
select case when null = null then 1 else 0 end as test
Upvotes: 0