Reputation: 9291
I have a query that is behaving in ways I would otherwise not expect.
I have two tables, stagin_users
and users
. In both tables I have a column called name
. In the users
table, EVERY value for name
is NULL
. In staging_users
I have 13 rows that do not have a NULL value. I am trying to run a query where I get all users in the staging table whose name is not in the users
table.
My query as written is:
SELECT name
FROM staging_users
WHERE name NOT IN (SELECT name FROM users);
As the query is written, I get NO results back. What is the reason for this behavior?
As the users
table only has NULL values I know I could say WHERE name IS NOT NULL
and I would get the same results, but I want this query to work against the values in the table, which all happen to be NULL
.
Upvotes: 0
Views: 153
Reputation: 13110
From the docs:
To comply with the SQL standard, IN returns NULL not only if the expression on the left hand side is NULL, but also if no match is found in the list and one of the expressions in the list is NULL.
And
expr NOT IN (value,...)
is the same asNOT (expr IN (value,...))
.
Thus as you are SELECT
ing NULL
values.. NOT IN
returns NULL
.. so no rows match.
You could rectify as so:
SELECT name
FROM staging_users
WHERE name IS NOT NULL
AND name NOT IN (
SELECT name
FROM users
WHERE name IS NOT NULL
);
Or, same logic:
SELECT su.name
FROM staging_users su
LEFT JOIN users u
ON u.name = su.name
AND su.name IS NOT NULL
AND u.name IS NOT NULL;
As an extra note, I would seriously question a data structure that allows users to have NULL names.. your original query will work if this is changed.
Upvotes: 3
Reputation: 21
NOT EXISTS is usually a better option for this type of query. NOT IN tends to be inherently unsafe when working with potentially NULL values. See the following link for more details.
Upvotes: 1