jason
jason

Reputation: 3512

Mysql Duplicate Rows ( Duplicate detected using 2 columns )

How to remove duplicated in this setup?

id    A       B 
----------------
1     apple   2  
2     orange  1       
3     apple   2   
4     apple   1 

In here I want to remove (apple,2) which occurs twice. The id numbers are unique. I would use DISTINCT keyword if it were not. Can I some how make a key out of columns A and B and then use the DISTINCT keyword on that to get what I need ? Many thanks for your replies.

Upvotes: 20

Views: 10357

Answers (6)

Peter M
Peter M

Reputation: 844

Yet another (from http://labs.creativecommons.org/2010/01/12/removing-duplicate-rows-in-mysql/). Add a unique index then delete it:

ALTER IGNORE TABLE mytable ADD UNIQUE INDEX tmpindex (A,B);
ALTER TABLE mytable DROP INDEX tmpindex;

The IGNORE keyword is a mysql extension that makes it drop rows that violate the UNIQUE keyword instead of just failing.

Upvotes: 6

davek
davek

Reputation: 22905

delete from myTable 
where id not in
(select min(id)
from myTable
group by A, B)

i.e. the select in brackets returns the first id for each grouping of A and B; deleting all ids that are not in this set will remove all occurences of an A-plus-B combination that are "subsequent" to its first occurrence.

EDIT: this syntax seems to be problematic: see bug report:

http://bugs.mysql.com/bug.php?id=5037

A possible workaround is to do this:

delete from myTable 
where id not in
(
      select minid from 
      (select min(id) as minid from myTable group by A, B) as newtable
) 

Upvotes: 22

Benjamin Cox
Benjamin Cox

Reputation: 6120

DELETE
FROM mytable
USING mytable, mytable AS vtable
WHERE vtable.id > mytable.id
AND mytable.A = vtable.A
AND mytable.A = vtable.A

Upvotes: 1

BryanD
BryanD

Reputation: 1927

I'm not exactly sure what you're asking here. If you don't want duplicates of the A and B columns, then do just what you mentioned SELECT DISTINCT A, B FROM XXX. Maybe you could post an example of the type of result you would like to see.

Upvotes: 0

Pablo Santa Cruz
Pablo Santa Cruz

Reputation: 181280

You could use a temporary table with the data you want:

insert into temp_table
select min(id), A, B
 group by A, B

Upvotes: 0

Larry Lustig
Larry Lustig

Reputation: 50970

DELETE FROM fruit_table FT1
WHERE EXISTS
(
    SELECT * FROM fruit_table FT2 
    WHERE FT2.fruit_name_column = FT1.fruit_name_column
    AND   FT2.fruit_integer_column = FT1.fruit_integer_column
    AND   FT2.id <> FT1.id
)

This assumes you don't care which of the duplicate records is removed.

Upvotes: 2

Related Questions