Reputation:
I try to do two query, in my mind it should give me the same result but no
SELECT count(1)
FROM users
WHERE users.id NOT IN (
SELECT user_id as id
FROM users_roles as ur
WHERE
ur.role_id = 10
OR ur.role_id = 12
)
i have the result:
count
-------
0
(1 row)
and second change first NOT IN
to IN
and wrap it to outer NOT IN
is
SELECT count(1)
FROM users
WHERE id NOT IN (
SELECT users.id
FROM users
WHERE users.id IN (
SELECT user_id as id
FROM users_roles as ur
WHERE
ur.role_id = 10
OR ur.role_id = 12
)
)
with result
count
---------
3150136
(1 row)
what is wrong with first query?
more details for second query:
SELECT count(1)
FROM users
WHERE users.id IN (
SELECT user_id as id
FROM users_roles as ur
WHERE
ur.role_id = 10
OR ur.role_id = 12
)
count
-------
40320
(1 row)
and
select count(1) from users;
count
---------
3190466
edit by user query:
database=# \d users_roles
Table "public.users_roles"
Column | Type | Modifiers
----------+---------+----------------------------------------------------------
user_id | integer |
role_id | integer |
track_id | integer |
id | integer | not null default nextval('users_roles_id_seq'::regclass)
Indexes:
"users_roles_pkey" PRIMARY KEY, btree (id)
"uniq_users_roles" UNIQUE CONSTRAINT, btree (user_id, role_id)
"uq_users_roles_role_track" UNIQUE CONSTRAINT, btree (role_id, track_id)
Foreign-key constraints:
"fk_roles_track_id" FOREIGN KEY (track_id) REFERENCES tracks(id)
"fk_users_roles_roles" FOREIGN KEY (role_id) REFERENCES roles(id)
"fk_users_roles_users" FOREIGN KEY (user_id) REFERENCES users(id)
and select
SELECT Count(user_id) as totalusers FROM users_roles as ur WHERE ur.role_id = 10 OR ur.role_id = 12;
totalusers
------------
40320
P.S.
database=# select version();
version
--------------------------------------------------------------------------------------------------------------
PostgreSQL 9.4.14 on x86_64-unknown-linux-gnu, compiled by gcc (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4, 64-bit
Upvotes: 4
Views: 211
Reputation: 1787
This could be because the table user_roles
might have NULL user_id
when role_id
= 10
or role_id
= 12
.
First query becomes something like:
SELECT COUNT(1)
FROM users
WHERE users.id NOT IN (id1, id2, id3, NULL, id4, NULL, id5,...)
Here a given users.id
will be compared with all values in the list and only if it is not equal to any id value in the above list, it will qualify to show up in the output.
With NULL values present in the list, the NOT IN
evaluates the condition for a given users.id
as [users.id
<>id1
AND users.id
<>id2
AND... users.id
<>NULL
AND users.id
<>id5
] which you would expect is TRUE but SQL treats it as FALSE because it evaluates users.id
<>NULL
as FALSE, hence evaluating the whole expression as FALSE and excludes that users.id
from the result. This happens all other users.id
in the users
table which is why you got 0 entries in the output.
FYI: id
= NULL
=> UNKNOWN as well as id
<> NULL
=> UNKNOWN where UNKNOWN is neither TRUE nor FALSE(but in the context of your problem you can still say not TRUE
when you were expecting it to be TRUE
) because NULL
is actually an "unknown" with no value at all.
To avoid this, you can remove NULLs from the inner list as given below:
SELECT count(1)
FROM users
WHERE users.id NOT IN (
SELECT user_id as id
FROM users_roles as ur
WHERE
(ur.role_id = 10 OR ur.role_id = 12) AND ur.user_id IS NOT NULL
)
This was with NOT IN
.
However, you won't face this problem when you use IN
because if there is a non-NULL users.id
which is a match in the list, then it will be included in the output which is why you got 40,320 records in the other query.
Upvotes: 3