Reputation: 381
If there is one more UID in sessions than there is in users (obviously not supposed to be that way), then I expect to have a non-empty result set when I run the last select, but I get no rows returned - this result just doesn't make logical sense to me...
select count(distinct(uid)) from users;
> 108736
select count(distinct(uid)) from sessions;
> 108737
select count(*) from sessions where uid not in (select uid from users);
> 0
and just for completeness:
select count(*) from users where uid not in (select uid from sessions);
> 0
I have checked for nulls:
select count( * ) from sessions where uid is null;
> 0
select count( * ) from users where uid is null;
> 14
The schema is defined in sqlalchemy and includes a foreign key in the session table:
uid = Column(Integer, ForeignKey('users.uid', use_alter=True, name='fk_uid'))
This schema is a static dump for analytics purposes so there is no chance of concurrency issues...
Upvotes: 1
Views: 184
Reputation: 166
Your third query does not do what you think it does.
The following query illustrates the problem:
SELECT 1 NOT IN (SELECT unnest(ARRAY[NULL]::int[]));
This returns NULL
, because it can't say if 1 <> NULL
.
So, in your query the where condition is always NULL
, because users
contains a NULL
uid.
I recommend using EXCEPT
do find the culprit in your sessions table.
SELECT uid from sessions EXCEPT SELECT uid from users;
Upvotes: 4