Reputation: 5
i'm trying to remove duplicates from a table with no primary key. the structure of the table look like this:
|cID|changeDate|changeTime|operator|
|001|01.01.2005|12:00:00 |a |
|001|01.01.2005|12:00:00 |null |
|002|01.01.2005|12:00:00 |a |
|002|01.02.2005|12:00:00 |a |
|002|01.02.2005|12:45:00 |a |
|003|01.01.2005|12:00:00 |a |
|003|01.01.2005|12:00:00 |a |
|003|01.02.2005|12:00:00 |a |
|003|01.03.2005|12:00:00 |a |
|003|01.03.2005|12:00:00 |null |
what i'm trying to achieve is: check for duplicates 'cID', keep the one which has the latest date in 'changeDate'. if two records have the same 'changeDate' keep the one with the latest 'changeTime'. And of this result if there are still duplicates keep the one that has not null in 'operator'.
the above table should look like this:
|cID|changeDate|changeTime|operator|
|001|01.01.2005|12:00:00 |a |
|002|01.02.2005|12:45:00 |a |
|003|01.03.2005|12:00:00 |a |
DB is mysql, engine is innodb. i would like to achieve this without creating a new table.
my sql skills are limited, in fact almost nonexistant. i've been reading and searching for a while, and i'm not getting there...
i've tried different approaches (temp tables, select with inner join)
any help would be much apreciated.
Upvotes: 0
Views: 117
Reputation: 2524
see SQLfiddle demo
Preserve all rows you need in a temporary table:
SELECT t.*
FROM (
SELECT (
CASE cid
WHEN @curCId
THEN @curRow := @curRow + 1
ELSE @curRow := 1 AND @curCId := cid END
) AS rank,
p.*
FROM mytable p,(SELECT @curRow := 0, @curCId := '') r
ORDER BY cid,changedate desc,changetime desc,operator desc
) t
where rank =1
Then delete rows from yourtable
To finish, insert rows from temp table to yourtable
Upvotes: 1