user2875230
user2875230

Reputation: 339

Delete Rows with duplicate fields in Postgres

I have a database (table name is campaigns_eventfire) with duplicate events for contacts and I'd like to delete all duplicates so each contact id has just one event.

Here's what my table looks like:

ID CONTACT_ID EVENT_ID SCHEDULED
1 8,854 1,187 02/08/2 06:00
2 8,854 1,187 02/08/2 06:00
3 8,854 1,187 02/08/2 06:00
4 8,854 1,187 02/08/2 06:00
5 8,854 1,187 02/08/2 06:00
6 8,854 1,187 02/08/2 06:00
7 8,854 1,187 02/08/2 06:00
8 8,854 1,187 02/08/2 06:00
9 8,854 1,187 02/08/2 06:00
10 8,854 1,187 02/08/2 06:00
11 8,854 1,187 02/08/2 06:00
12 8,854 1,187 02/08/2 06:00
13 9,124 1,145 02/08/2 06:00
14 9,124 1,145 02/08/2 06:00
15 10,570 924 02/08/2 06:00
16 10,570 924 02/08/2 06:00

Contact_ID 8854 with event_id 1187 appears twelve times which means that the contact is scheduled for the same event 12 times which is wrong.

Contact_ID 9124 with event_id 1145 appears two times.

Contact_ID 10,570 with event_id 924 appears two times

Here's what I want the table to look like after deletion.

ID CONTACT_ID EVENT_ID SCHEDULED
1 8,854 1,187 02/08/2 06:00
13 9,124 1,145 02/08/2 06:00
15 10,570 924 02/08/2 06:00

It doesn't matter which ID gets deleted, so long as the duplicates are removed. I have around 600 affected records so deleting one at a time is just not practical.

Thanks.

Upvotes: 1

Views: 1409

Answers (2)

Jim Jones
Jim Jones

Reputation: 19613

It looks like a job for DELETE USING.

DELETE FROM t c1 USING t c2
WHERE c1.contract_id = c2.contract_id AND
      c1.id > c2.id;

Demo: db<>fiddle

Further reading: How To Delete Duplicate Rows in PostgreSQL

Upvotes: 2

Gordon Linoff
Gordon Linoff

Reputation: 1269513

Aggregation does what you want as a query:

SELECT MIN(ID), CONTACT_ID, EVENT_ID, SCHEDULED
FROM T
GROUP BY CONTACT_ID, EVENT_ID, SCHEDULED;

If you want to delete the rows, then you can use a correlated subquery:

delete from t
    where t.id > (select min(t2.id)
                  from t t2
                  where t2.contact_id = t.contact_id and
                        t2.event_id = t.event_id and
                        t2.scheduled = t.scheduled
                 );

Upvotes: 2

Related Questions