How can I delete rows with a few identical values in MySQL?

Question

I have a table with about 11 fields (columns), and I know for a fact that there are multiple repeating entries, but they are repeating only within few fields. What I want to do is that if two or more rows have identical ProviderName and Address in lowercase, and identical StateID, then I want to delete those extra rows, and leave just one copy of it. I have something like this, but I have no idea how to complete it.

SET SQL_SAFE_UPDATEs=0;
DELETE FROM providers
WHERE LCASE(ProviderName), LCASE(Address), StateID;

UPDATE:

SET SQL_SAFE_UPDATES=0;
DELETE p1.* 
FROM providers AS p1
JOIN providers AS p2 
ON LCASE(p1.ProviderName) = LCASE(p2.ProviderName)
AND LCASE(p1.Address) = LCASE(p2.Address)
AND p1.Zip = p2.Zip
AND p1.StateID = p2.StateID
AND p1.ProviderId > p2.ProviderId

Barmar · Accepted Answer

This answer assumes you have a unique id field that distinguishes the rows that have duplicate values. It will keep the row with the lowest id.

DELETE p1.*
FROM providers AS p1
JOIN providers AS p2 
ON LCASE(p1.ProviderName) = LCASE(p2.ProviderName)
AND LCASE(p1.Address) = LCASE(p2.Address)
AND p1.StateID = p2.StateID
AND p1.id > p2.id

Note that if your table uses case-insensitive collation, you don't need to call LCASE.

A more efficient query might be:

DELETE p1.*
FROM providers AS p1
JOIN (SELECT LCASE(ProviderName) AS Name, LCASE(Address) AS Addr, StateID, MIN(id) AS id
      FROM providers
      GROUP BY Name, Addr, StateID
      HAVING COUNT(*) > 1) AS p2
ON LCASE(p1.ProviderName) = p2.Name
AND LCASE(p1.Address) = p2.Addr
AND p1.StateID = p2.StateID
AND p1.id > p2.id

The cross-product is much smaller because we first group all the providers with identical values. Also, the HAVING clause filters out all the non-duplicates.

How can I delete rows with a few identical values in MySQL?

Answers (2)

Related Questions