Reputation: 339
I have a database (table name is campaigns_eventfire) with duplicate events for contacts and I'd like to delete all duplicates so each contact id has just one event.
Here's what my table looks like:
ID | CONTACT_ID | EVENT_ID | SCHEDULED |
---|---|---|---|
1 | 8,854 | 1,187 | 02/08/2 06:00 |
2 | 8,854 | 1,187 | 02/08/2 06:00 |
3 | 8,854 | 1,187 | 02/08/2 06:00 |
4 | 8,854 | 1,187 | 02/08/2 06:00 |
5 | 8,854 | 1,187 | 02/08/2 06:00 |
6 | 8,854 | 1,187 | 02/08/2 06:00 |
7 | 8,854 | 1,187 | 02/08/2 06:00 |
8 | 8,854 | 1,187 | 02/08/2 06:00 |
9 | 8,854 | 1,187 | 02/08/2 06:00 |
10 | 8,854 | 1,187 | 02/08/2 06:00 |
11 | 8,854 | 1,187 | 02/08/2 06:00 |
12 | 8,854 | 1,187 | 02/08/2 06:00 |
13 | 9,124 | 1,145 | 02/08/2 06:00 |
14 | 9,124 | 1,145 | 02/08/2 06:00 |
15 | 10,570 | 924 | 02/08/2 06:00 |
16 | 10,570 | 924 | 02/08/2 06:00 |
Contact_ID 8854 with event_id 1187 appears twelve times which means that the contact is scheduled for the same event 12 times which is wrong.
Contact_ID 9124 with event_id 1145 appears two times.
Contact_ID 10,570 with event_id 924 appears two times
Here's what I want the table to look like after deletion.
ID | CONTACT_ID | EVENT_ID | SCHEDULED |
---|---|---|---|
1 | 8,854 | 1,187 | 02/08/2 06:00 |
13 | 9,124 | 1,145 | 02/08/2 06:00 |
15 | 10,570 | 924 | 02/08/2 06:00 |
It doesn't matter which ID gets deleted, so long as the duplicates are removed. I have around 600 affected records so deleting one at a time is just not practical.
Thanks.
Upvotes: 1
Views: 1409
Reputation: 19613
It looks like a job for DELETE USING
.
DELETE FROM t c1 USING t c2
WHERE c1.contract_id = c2.contract_id AND
c1.id > c2.id;
Demo: db<>fiddle
Further reading: How To Delete Duplicate Rows in PostgreSQL
Upvotes: 2
Reputation: 1269513
Aggregation does what you want as a query:
SELECT MIN(ID), CONTACT_ID, EVENT_ID, SCHEDULED
FROM T
GROUP BY CONTACT_ID, EVENT_ID, SCHEDULED;
If you want to delete the rows, then you can use a correlated subquery:
delete from t
where t.id > (select min(t2.id)
from t t2
where t2.contact_id = t.contact_id and
t2.event_id = t.event_id and
t2.scheduled = t.scheduled
);
Upvotes: 2