Richard
Richard

Reputation: 8935

SQL to disable duplicate rows (Postgres)

I am using a Postgres database, and have the following table.

                  LOB
+-------+------+-----------+----------+
| lobid | name | companyid | disabled |
+-------+------+-----------+----------+
|   1   | abc  |    101    | false    |
+-------+------+-----------+----------+
|   2   | abc  |    101    | false    |
+-------+------+-----------+----------+
|   3   | def  |    102    | false    |
+-------+------+-----------+----------+
|   4   | def  |    103    | false    |
+-------+------+-----------+----------+
|   5   | ghi  |    103    | false    |
+-------+------+-----------+----------+

I would like to write a query to SELECT all duplicates that are not disabled for the same company. i.e. same name and companyid.

e.g.

+-------+------+-----------+----------+
| lobid | name | companyid | disabled |
+-------+------+-----------+----------+
|   1   | abc  |    101    | false    |
+-------+------+-----------+----------+
|   2   | abc  |    101    | false    |
+-------+------+-----------+----------+

Then I would like an UPDATE sql statement to update all the duplicates to disabled true. However it should leave one (any one) row enabled. i.e. so there is just singles and no duplicates.

e.g.

+-------+------+-----------+----------+
| lobid | name | companyid | disabled |
+-------+------+-----------+----------+
|   1   | abc  |    101    | false    |
+-------+------+-----------+----------+
|   2   | abc  |    101    | true     |
+-------+------+-----------+----------+
|   3   | def  |    102    | false    |
+-------+------+-----------+----------+
|   4   | def  |    103    | false    |
+-------+------+-----------+----------+
|   5   | ghi  |    103    | false    |
+-------+------+-----------+----------+

I have tried the following, but it is not correct. I am still seeing dome singles in the result set:

select l.* 
from lob l 
where l.disabled = false  
and l.companyid in (
select companyid
  from lob 
  group by name, companyid, disabled
  having count(*) > 1 and disabled = false)
order by l.companyid, l.name

UPDATE

Thanks to Rory O'Connells answer below, I run this SQL:

UPDATE lob SET disabled = true WHERE lobid IN 
(select l.lobid 
from lob l 
join ( 
    select min(lobId) as lobId, name, companyid
    from lob 
    where disabled = false
    group by name, companyid, disabled
    having count(*) > 1
) k on l.name = k.name and l.companyid = k.companyid
where l.disabled = false  
and l.lobId > k.lobId
order by l.companyid, l.name)

Upvotes: 1

Views: 282

Answers (2)

Maciej Los
Maciej Los

Reputation: 8591

For Postgres version > 9.1 (i think), you can use something like this:

;WITH src as
(
  SELECT lobid, name, companyid, disabled, ROW_NUMBER() OVER(PARTITION BY companyid, name ORDER BY disabled) rn
  FROM lob
)
UPDATE lob
SET disabled = 'true'
FROM src 
WHERE lob.lobid = src.lobid AND src.rn>1;

SELECT *
FROM lob
ORDER BY lobid;

DbFiddle

For further details, please see: PostgreSQL: UPDATE

Upvotes: 2

Rory O'Connell
Rory O'Connell

Reputation: 76

OK, so you need to identify the companies/ids where there is a problem (as you have done):

select min(lobId) as keepId, name, companyid
from lob 
where disabled = false
group by name, companyid, disabled
having count(*) > 1

From these, you only want to leave 1 company record with disabled = false, correct? So you need to update those which are not equal to the minimum row id (or maximum row id, pick one).

So use a join here. Get the minimum value (This also filters out companies where there is no record to update).

select l.* 
from lob l 
join ( 
    select min(lobId) as keepId, name, companyid
    from lob 
    where disabled = false
    group by name, companyid, disabled
    having count(*) > 1
) k on l.name = k.name and l.companyid = k.companyid
where l.disabled = false  
and l.lobId > k.lobId
order by l.companyid, l.name

Then when you're happy, update disabled = true based on l.lobId

Upvotes: 2

Related Questions