StackOverflowNewbie
StackOverflowNewbie

Reputation: 40653

How to Delete "Duplicate" Records in MySQL?

I need to delete certain records from my table that I consider "duplicates". They're not exactly duplicates as not every column value are the same. Rather, the logic is something like this:

If col_a and col_b have the same value across several rows, and col_c (which is a timestamp) is within, say, 5 minutes of each other, then delete all rows except the row with the earliest timestamp.

Example Data:

id    col_a    col_b     col_c
1     foo      bar       2016-01-01 00:00:00
2     foo      bar       2016-01-01 00:00:12
3     foo      bar       2016-01-01 00:00:22
4     foo      bar       2016-01-05 00:00:00
5     apple    banana    2016-01-01 00:00:00
6     apple    banana    2016-01-05 00:00:00

In the above example, I want to delete id = 2 and id = 3. Is this possible to do in MySQL?

Upvotes: 1

Views: 191

Answers (1)

Deepak Puthraya
Deepak Puthraya

Reputation: 1435

I think this could do the trick

DELETE FROM tab
WHERE ID IN(
select t1.id
FROM tab as t1 JOIN tab as t2
ON t1.col1=t2.col1 AND t1.col2 = t2.col2
WHERE DATE_DIFF(MINUTE, t1.col3, t2.col3) < 5 
AND DATE_DIFF(MINUTE, t1.col3, t2.col3) > 0) 

Join the table and get all the duplicates. In those duplicates select only the ones that satisfy the time constraint. Note: > 0 and not >=0

Upvotes: 1

Related Questions