Reputation: 243
How to find records with duplicate values in any column using Activerecord or SQL?
SELECT leads.id, leads.name, leads.email, leads.created_at, array_agg(tn2.id) as ids
FROM "leads" join leads tn2
on leads.name = tn2.name
or leads.cpf_cnpj = tn2.cpf_cnpj
or leads.email = tn2.email
or leads.phone -> 'cellphone' = tn2.phone -> 'cellphone'
or leads.phone -> 'residence' = tn2.phone -> 'residence'
or leads.phone -> 'commercial' = tn2.phone -> 'commercial'
GROUP BY leads.id ORDER BY leads.created_at DESC
Using array_agg
I want only ids from repeated objects, but it gives me from all records.
Currently, I'm using PostgreSQL.
Upvotes: 1
Views: 232
Reputation: 656331
How to find records with duplicate values in any column?
SELECT l.id, l.name, l.email, l.created_at, array_agg(l2.id) AS ids
FROM leads l
WHERE EXISTS (
SELECT 1
FROM leads
WHERE id <> l.id
AND (
name = l.name
OR cpf_cnpj = l.cpf_cnpj
OR email = l.email
OR phone->'cellphone' = l.phone->'cellphone'
OR phone->'residence' = l.phone->'residence'
OR phone->'commercial' = l.phone->'commercial'
)
);
But it seems like you want something different:
How to get an array of IDs for each row from rows that have the same value in at least one of several given columns, youngest entry first?
SELECT l.id, l.name, l.email, l.created_at
, array_agg(l2.id ORDER BY l2.created_at DESC NULL LAST) AS dupe_ids
FROM leads l
JOIN leads l2 ON l2.id <> l.id
AND (
l2.name = l.name
OR l2.cpf_cnpj = l.cpf_cnpj
OR l2.email = l.email
OR l2.phone->'cellphone' = l.phone->'cellphone'
OR l2.phone->'residence' = l.phone->'residence'
OR l2.phone->'commercial' = l.phone->'commercial'
)
GROUP BY l.id
ORDER BY l.created_at DESC NULL LAST;
Assuming id
is the primary key.
Upvotes: 1