postgres select count distinct returning unexpected extra row

Question

If there is one more UID in sessions than there is in users (obviously not supposed to be that way), then I expect to have a non-empty result set when I run the last select, but I get no rows returned - this result just doesn't make logical sense to me...

select count(distinct(uid)) from users;

> 108736

select count(distinct(uid)) from sessions;

> 108737

select count(*) from sessions where uid not in (select uid from users);

> 0

and just for completeness:

select count(*) from users where uid not in (select uid from sessions);

> 0

I have checked for nulls:

select count( * ) from sessions where uid is null; 

> 0 

select count( * ) from users where uid is null;

> 14

The schema is defined in sqlalchemy and includes a foreign key in the session table:

uid = Column(Integer, ForeignKey('users.uid', use_alter=True, name='fk_uid'))

This schema is a static dump for analytics purposes so there is no chance of concurrency issues...

Hinni · Accepted Answer

Your third query does not do what you think it does.

The following query illustrates the problem:

SELECT 1 NOT IN (SELECT unnest(ARRAY[NULL]::int[]));

This returns NULL, because it can't say if 1 <> NULL. So, in your query the where condition is always NULL, because users contains a NULL uid.

I recommend using EXCEPT do find the culprit in your sessions table.

SELECT uid from sessions EXCEPT SELECT uid from users;

postgres select count distinct returning unexpected extra row

Answers (1)

Related Questions