David Morales
David Morales

Reputation: 18064

How can I get records from one table which do not exist in a related table?

I have this users table:

users

and this relationships table:

enter image description here

So each user is paired with another one in the relationships table.

Now I want to get a list of users which are not in the relationships table, in either of the two columns (user_id or pair_id).

How could I write that query?

First try:

SELECT users.id
FROM users
LEFT OUTER JOIN relationships
ON users.id = relationships.user_id
WHERE relationships.user_id IS NULL;

Output:

enter image description here

This is should display only 2 results: 5 and 6. The result 8 is not correct, as it already exists in relationships. Of course I'm aware that the query is not correct, how can I fix it?

Upvotes: 1

Views: 145

Answers (4)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 659347

This is a special case of:
Select rows which are not present in other table

I suppose this will be simplest and fastest:

SELECT u.id
FROM   users u
WHERE  NOT EXISTS (
   SELECT 1
   FROM   relationships r
   WHERE  u.id IN (r.user_id, r.pair_id)
   );

In Postgres, u.id IN (r.user_id, r.pair_id) is just short for:(u.id = r.user_id OR u.id = r.pair_id).
The expression is transformed that way internally, which can be observed from EXPLAIN ANALYZE.

To clear up speculations in the comments: Modern versions of Postgres are going to use matching indexes on user_id, and / or pair_id with this sort of query.

Upvotes: 2

Glenn
Glenn

Reputation: 9170

I like the set operators

select id from users
  except
select user_id from relationships
  except
select pair_id from relationships

or

select id from users
  except
(select user_id from relationships
   union
 select pair_id from relationships
)

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1271231

You need to compare to both values in the on statement:

SELECT u.id
FROM users u LEFT OUTER JOIN
     relationships r
     ON u.id = r.user_id or u.id = r.pair_id
WHERE r.user_id IS NULL;

In general, or in an on clause can be inefficient. I would recommend replacing this with two not exists statements:

SELECT u.id
FROM users u
WHERE NOT EXISTS (SELECT 1 FROM relationships r WHERE u.id = r.user_id) AND
      NOT EXISTS (SELECT 1 FROM relationships r WHERE u.id = r.pair_id);

Upvotes: 4

Alan Hay
Alan Hay

Reputation: 23246

Something like:

select u.id
from users u
where u.id not in (select r.user_id from relationships r) 
and u.id not in (select r.pair_id from relationships r) 

Upvotes: 0

Related Questions