Josh K

Reputation: 28883

Select a record that has a duplicate

I'd like to select all records from a table (names) where lastname is not unique. Preferably I would like to delete all records that are duplicates.

How would this be done? Assume that I don't want to rerun one query multiple times until it quits.

Upvotes: 1

Views: 309

Answers (4)

Edward Dale

Reputation: 30133

To find which lastnames have duplicates:

  SELECT lastname, COUNT(lastname) AS rowcount 
    FROM names 
GROUP BY lastname 
  HAVING rowcount > 1
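
The question asks for the full records rather than just the last names, which can be done by joining the grouped result back to the table. Below is a minimal sketch, assuming an SQLite in-memory database as a stand-in for the real server; the `id` and `firstname` columns and the sample rows are hypothetical, and only `names` and `lastname` come from the question.

```python
import sqlite3

# Hypothetical schema and sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE names (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT)"
)
conn.executemany(
    "INSERT INTO names (firstname, lastname) VALUES (?, ?)",
    [("Ann", "Smith"), ("Bob", "Smith"), ("Cat", "Jones"),
     ("Dan", "Lee"), ("Eve", "Lee")],
)

# Join every row back to the set of last names that occur more than once.
duplicate_rows = conn.execute(
    """
    SELECT n.id, n.firstname, n.lastname
      FROM names AS n
      JOIN (SELECT lastname
              FROM names
          GROUP BY lastname
            HAVING COUNT(*) > 1) AS d
        ON d.lastname = n.lastname
  ORDER BY n.id
    """
).fetchall()

for row in duplicate_rows:
    print(row)
```

With this sample data, only the "Jones" row is left out of the result, since it is the only last name that appears once.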

To delete one of the duplicates for each last name, run the following repeatedly until it no longer affects any rows. It's not very graceful.

DELETE FROM names 
 WHERE id IN (SELECT MIN(id) 
                FROM (SELECT * FROM names) AS t 
            GROUP BY lastname 
              HAVING COUNT(lastname) > 1)

Upvotes: 3

Gary

Reputation: 2916

The fastest and easiest way to delete duplicate records is by issuing a very simple command:

ALTER IGNORE TABLE [TABLENAME] ADD UNIQUE INDEX UNIQUE_INDEX ([FIELDNAME])

This will lock the table. If that is an issue, try:

DELETE t1
  FROM table1 t1, table1 t2
 WHERE t1.duplicate_field = t2.duplicate_field
   -- add more conditions if needed, e.g. AND t1.duplicate_field2 = t2.duplicate_field2
   AND t1.unique_field > t2.unique_field
   -- break the delete up into ranges of unique_field to run faster
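
The "break up into ranges" idea can be sketched as a loop that deletes one `id` range per statement. This is only an illustration under stated assumptions: it uses SQLite, which has no multi-table `DELETE`, so the self-join is expressed as a correlated `EXISTS`; the `id` column, the `batch_size` parameter, and the helper name `delete_duplicates_in_ranges` are hypothetical.

```python
import sqlite3


def delete_duplicates_in_ranges(conn, batch_size=1000):
    """Delete rows that share a lastname with a lower-id row, one id range at a time."""
    max_id = conn.execute("SELECT COALESCE(MAX(id), 0) FROM names").fetchone()[0]
    deleted = 0
    for low in range(1, max_id + 1, batch_size):
        high = low + batch_size - 1
        cur = conn.execute(
            """
            DELETE FROM names
             WHERE id BETWEEN ? AND ?
               AND EXISTS (SELECT 1
                             FROM names AS t2
                            WHERE t2.lastname = names.lastname
                              AND t2.id < names.id)
            """,
            (low, high),
        )
        conn.commit()  # commit per range so each batch is a short transaction
        deleted += cur.rowcount
    return deleted


# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, lastname TEXT)")
conn.executemany(
    "INSERT INTO names (lastname) VALUES (?)",
    [("Smith",), ("Smith",), ("Jones",), ("Lee",), ("Lee",)],
)

deleted_count = delete_duplicates_in_ranges(conn, batch_size=2)
remaining = conn.execute("SELECT id, lastname FROM names ORDER BY id").fetchall()
print(deleted_count, remaining)
```

The row with the lowest `id` for each last name never matches the `EXISTS` condition, so it survives regardless of how the ranges are cut.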

Upvotes: 2

oezi

Reputation: 51797

SELECT lastname, COUNT(*) AS mycountvar FROM names GROUP BY lastname HAVING mycountvar > 1;

and then

DELETE FROM names WHERE lastname = '$mylastnamevar' LIMIT $mycountvar-1

But why don't you just flag the field "lastname" as unique, so that duplicates can't get in at all?
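
As a rough illustration of that suggestion, the sketch below adds a unique index and shows a second insert of the same last name being rejected. It assumes SQLite for the sake of a self-contained example; the index name `idx_names_lastname` and the sample value are hypothetical, and `INSERT OR IGNORE` is SQLite's spelling (MySQL uses `INSERT IGNORE`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, lastname TEXT)")

# The unique index makes the database itself refuse duplicate last names.
conn.execute("CREATE UNIQUE INDEX idx_names_lastname ON names (lastname)")

conn.execute("INSERT INTO names (lastname) VALUES ('Smith')")

try:
    conn.execute("INSERT INTO names (lastname) VALUES ('Smith')")
    rejected = False
except sqlite3.IntegrityError:
    # Raised because the second 'Smith' violates the unique index.
    rejected = True

# INSERT OR IGNORE skips the conflicting row instead of raising an error.
conn.execute("INSERT OR IGNORE INTO names (lastname) VALUES ('Smith')")

row_count = conn.execute("SELECT COUNT(*) FROM names").fetchone()[0]
print(rejected, row_count)
```

Existing duplicates have to be removed before the index can be created, since index creation fails on a table that already violates it.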

Upvotes: 0

Glennular

Reputation: 18215

Duplicate of: How can I remove duplicate rows?

DELETE names
FROM names
LEFT OUTER JOIN (
   SELECT MIN(RowId) as RowId, lastname 
   FROM names
   GROUP BY lastname 
) as KeepRows ON
   names.RowId = KeepRows.RowId
WHERE
   KeepRows.RowId IS NULL

Assumption: you have a RowId column.
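
The same keep-the-lowest-RowId idea can be tried out in a single pass with a `NOT IN` subquery, which avoids rerunning the statement. This is a minimal sketch, assuming SQLite (which does not support the `DELETE ... FROM ... JOIN` form) and hypothetical sample rows; `RowId` follows the assumption above.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (RowId INTEGER PRIMARY KEY, lastname TEXT)")
conn.executemany(
    "INSERT INTO names (lastname) VALUES (?)",
    [("Smith",), ("Smith",), ("Jones",), ("Lee",), ("Lee",), ("Lee",)],
)

# Keep the lowest RowId per lastname and delete every other row in one statement.
cur = conn.execute(
    """
    DELETE FROM names
     WHERE RowId NOT IN (SELECT MIN(RowId)
                           FROM names
                       GROUP BY lastname)
    """
)
conn.commit()

rows_deleted = cur.rowcount
kept_rows = conn.execute("SELECT RowId, lastname FROM names ORDER BY RowId").fetchall()
print(rows_deleted, kept_rows)
```

Each last name ends up with exactly one row, no matter how many copies it started with.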

Upvotes: 0
