Will Harrison
Will Harrison

Reputation: 845

I hit a weird issue running a query against a PostgreSQL Database using NOT IN and don't understand why it didn't work

I have a hierarchical table with an id and parent_id column, the parent_id has a foreign key to the id column. Each row can only have one parent, but multiple rows can be tied to the same parent. I wanted to retrieve all the rows that didn't have any children.

I attempted this with the following query

SELECT *
FROM table
WHERE id NOT IN (
    SELECT DISTINCT(parent_id)
    FROM table
)

This returned 0 rows. If i change the NOT IN to IN it correctly returns the rows that have children (other rows tie to it through their parent_id)

I ended up getting this working:

SELECT *
FROM table
WHERE id NOT IN(
    SELECT id
    FROM table
    WHERE id IN (
        SELECT DISTINCT(parent_id)
        FROM table
    )
)

But I don't understand why the first query didn't work? Can anybody help me to understand what's going on here? Do I not understand how NOT IN should work?

Upvotes: 1

Views: 73

Answers (1)

Abelisto
Abelisto

Reputation: 15624

Try

SELECT *
FROM table
WHERE id NOT IN (
    SELECT DISTINCT(parent_id)
    FROM table
    WHERE parent_id IS NOT NULL
)

For example:

with t(x) as (values(1),(2))
select 3
where 3 not in (select x from t);

┌──────────┐
│ ?column? │
├──────────┤
│        3 │
└──────────┘

but

with t(x) as (values(1),(2),(null))
select 3
where 3 not in (select x from t);

┌──────────┐
│ ?column? │
├──────────┤
└──────────┘

It is because the DBMS can't made decision is id = null or not (the result is undefined)

You could to fix it as mentioned above or using not exists:

with t(x) as (values(1),(2),(null))
select 3
where not exists (select x from t where x = 3);

┌──────────┐
│ ?column? │
├──────────┤
│        3 │
└──────────┘

Upvotes: 1

Related Questions