Bob Dem
Bob Dem

Reputation: 1031

Converting a PostgreSQL query with IN / NOT IN to JOINs

I currently have two tables with the same structure. Table A (as an example) has 10,000 rows, table B has 100,000 rows. I need to obtain the rows that are in Table B that are not in Table A, but only if certain fields are the same (and one is not).

Right now, the query is something like:

select *
from tableA A
where (A.field1, A.field2) in (select field1, field2 from tableB B)
  and A.field3 not in (select field3 from B)

This works, but probably a better performant solution could be done with JOINs. I have tried to do it but all I get is a very huge list of duplicated rows. Could someone point me in the right direction?

Upvotes: 0

Views: 144

Answers (2)

Avitus
Avitus

Reputation: 15958

Based on your current query this is what it translates to as joins:

select * 
from tableA A
inner join tableB B on A.field1 = B.field1 and A.field2 = B.field2
left outer join tableB C on A.field3 = C.field3
where c.field3 is null

A faster query would be:

    select A.pk
    from tableA A
    inner join tableB B on A.field1 = B.field1 and A.field2 = B.field2
    left outer join tableB C on A.field3 = C.field3
    where c.field3 is null
    group by A.pk

This would give you the rows you need to add to tableB because they aren't found.

Or you can just get the fields you want to pull over:

    select A.field1, A.field2, A.field3
    from tableA A
    inner join tableB B on A.field1 = B.field1 and A.field2 = B.field2
    left outer join tableB C on A.field3 = C.field3
    where c.field3 is null
    group by A.field1, A.field2, A.field3

Upvotes: 2

joop
joop

Reputation: 4503

[NOT] EXISTS is your friend:


SELECT *
FROM tableA A
WHERE EXISTS ( SELECT * FROM tableB B
    WHERE A.field1 = B.field1
    AND A.field2 = B.field2
    )
AND NOT EXISTS ( SELECT * FROM tableB B 
    WHERE  A.field3 = B.field3 
    );

Note: if the joined columns are NOT NULLable, the [NOT] EXISTS() version will behave exactly the same as the [NOT] IN version


Reading the question text again (and again):

I need to obtain the rows that are in Table B that are not in Table A, but only if certain fields are the same (and one is not).


SELECT *
FROM tableB B
WHERE EXISTS ( SELECT * FROM tableA A
        WHERE A.field1 = B.field1
        AND A.field2 = B.field2 
        AND  A.field3 <> B.field3
        );

Upvotes: 0

Related Questions