tett
tett

Reputation: 605

How can I delete rows with a few identical values in MySQL?

I have a table with about 11 fields (columns), and I know for a fact that there are multiple repeating entries, but they are repeating only within few fields. What I want to do is that if two or more rows have identical ProviderName and Address in lowercase, and identical StateID, then I want to delete those extra rows, and leave just one copy of it. I have something like this, but I have no idea how to complete it.

SET SQL_SAFE_UPDATEs=0;
DELETE FROM providers
WHERE LCASE(ProviderName), LCASE(Address), StateID;

UPDATE:

SET SQL_SAFE_UPDATES=0;
DELETE p1.* 
FROM providers AS p1
JOIN providers AS p2 
ON LCASE(p1.ProviderName) = LCASE(p2.ProviderName)
AND LCASE(p1.Address) = LCASE(p2.Address)
AND p1.Zip = p2.Zip
AND p1.StateID = p2.StateID
AND p1.ProviderId > p2.ProviderId

Upvotes: 0

Views: 71

Answers (2)

Barmar
Barmar

Reputation: 780688

This answer assumes you have a unique id field that distinguishes the rows that have duplicate values. It will keep the row with the lowest id.

DELETE p1.*
FROM providers AS p1
JOIN providers AS p2 
ON LCASE(p1.ProviderName) = LCASE(p2.ProviderName)
AND LCASE(p1.Address) = LCASE(p2.Address)
AND p1.StateID = p2.StateID
AND p1.id > p2.id

Note that if your table uses case-insensitive collation, you don't need to call LCASE.

A more efficient query might be:

DELETE p1.*
FROM providers AS p1
JOIN (SELECT LCASE(ProviderName) AS Name, LCASE(Address) AS Addr, StateID, MIN(id) AS id
      FROM providers
      GROUP BY Name, Addr, StateID
      HAVING COUNT(*) > 1) AS p2
ON LCASE(p1.ProviderName) = p2.Name
AND LCASE(p1.Address) = p2.Addr
AND p1.StateID = p2.StateID
AND p1.id > p2.id

The cross-product is much smaller because we first group all the providers with identical values. Also, the HAVING clause filters out all the non-duplicates.

Upvotes: 1

RuslanN
RuslanN

Reputation: 408

To leave entry with highest id:

SET SQL_SAFE_UPDATES = 0;
DELETE n1 FROM provider n1, provider n2 WHERE n1.id < n2.id AND   
LCASE(n1.ProviderName) = LCASE(n2.ProviderName) and LCASE(n1.Address) =   
LCASE(n2.Address)   
and n1.StateId = n2.StateId

Upvotes: 0

Related Questions