Postgres remove composite duplicates

Question

In my populated database I have a table originally with this schema:

CREATE TABLE tale_references (
    id SERIAL PRIMARY KEY,
    creation_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    tale_id TEXT REFERENCES tale_catalog(tale_id) ON DELETE CASCADE,
    reference_id INT REFERENCES FT_References(id) ON DELETE CASCADE,
    tale_catalog_id INT REFERENCES tale_catalog(id) ON DELETE CASCADE,
    other_info JSONB);

I need to add to it the following line:

CONSTRAINT no_duplicate_tale UNIQUE (tale_catalog_id, reference_id))

However, the data already in the database has entries that break this constraint. How can I find these entries so that I can remove them?

Gordon Linoff · Accepted Answer

I would do this as:

delete from tale_references t
    where exists (select 1
                  from tale_references t2
                  where t2.tale_catalog_id = t.tale_catalog_id and
                        t2.reference_id = t.reference_id and
                        t2.id < t.id
                 );

This deletes any row where the tale_catalog_id/reference_id pair has a smaller value. In other words, it keeps the smallest value for each pair.

Postgres remove composite duplicates

Answers (2)

Related Questions