remove duplicates in table without primary key. several conditions

Question

i'm trying to remove duplicates from a table with no primary key. the structure of the table look like this:

|cID|changeDate|changeTime|operator|
|001|01.01.2005|12:00:00  |a       |
|001|01.01.2005|12:00:00  |null    |
|002|01.01.2005|12:00:00  |a       |
|002|01.02.2005|12:00:00  |a       |
|002|01.02.2005|12:45:00  |a       |
|003|01.01.2005|12:00:00  |a       |
|003|01.01.2005|12:00:00  |a       |
|003|01.02.2005|12:00:00  |a       |
|003|01.03.2005|12:00:00  |a       |
|003|01.03.2005|12:00:00  |null    |

what i'm trying to achieve is: check for duplicates 'cID', keep the one which has the latest date in 'changeDate'. if two records have the same 'changeDate' keep the one with the latest 'changeTime'. And of this result if there are still duplicates keep the one that has not null in 'operator'.

the above table should look like this:

|cID|changeDate|changeTime|operator|
|001|01.01.2005|12:00:00  |a       |
|002|01.02.2005|12:45:00  |a       |
|003|01.03.2005|12:00:00  |a       |

DB is mysql, engine is innodb. i would like to achieve this without creating a new table.

my sql skills are limited, in fact almost nonexistant. i've been reading and searching for a while, and i'm not getting there...

i've tried different approaches (temp tables, select with inner join)

any help would be much apreciated.

Kobi · Accepted Answer

see SQLfiddle demo

Preserve all rows you need in a temporary table:

SELECT t.* 
FROM (
SELECT ( 
    CASE cid 
    WHEN @curCId 
    THEN @curRow := @curRow + 1 
    ELSE @curRow := 1 AND @curCId := cid END
  ) AS rank,
  p.*
FROM      mytable p,(SELECT @curRow := 0, @curCId := '') r
ORDER BY cid,changedate desc,changetime desc,operator desc
) t
where rank =1

Then delete rows from yourtable

To finish, insert rows from temp table to yourtable

remove duplicates in table without primary key. several conditions

Answers (1)

Related Questions