Reputation: 1750
I have a problem, actually a question on how would be the easiest way to delete all the duplicates in a MySQL table based on the first SELECT's name and timestamp.
So, I have this table:
SELECT * FROM search;
+------+--------+------------+--------------------------+-----------------------+
| id | uid | string | cat | timestamp |
+------+--------+------------+--------------------------+-----------------------+
|39523 | 87 |eat to live | a:1:{i:0;s:5:"toate";} | 2013-10-07 17:01:41 |
|39524 | 87 |eat to live | s:6:"author"; | 2013-10-07 17:01:41 |
|39525 | 87 |eat to live | s:6:"people"; | 2013-10-07 17:01:41 |
|39526 | 87 |eat to live | s:7:"company"; | 2013-10-07 17:01:41 |
|39527 | 87 |eat to live | s:6:"author"; | 2013-10-07 17:01:42 |
|39532 | 31 | friends | a:1:{i:0;s:5:"toate";} | 2013-10-07 17:04:17 |
|39533 | 31 | friends | s:6:"people"; | 2013-10-07 17:04:17 |
|39534 | 31 | friends | s:7:"company"; | 2013-10-07 17:04:17 |
|39535 | 31 | friends | s:6:"author"; | 2013-10-07 17:04:17 |
+------+--------+------------+--------------------------+-----------------------+
And I want to have left only the first unique appearance. My solution would be to Select all the rows, go through' them, and for each Row to delete all the rows that match the same string, the same uid and timestamp to be either the same timestamp or +1 second.
Any thoughts / suggestions ? I can do this in PHP and was wondering if there is an easier way to do this directly with an SQL query.
Thanks
Upvotes: 2
Views: 563
Reputation: 562328
MySQL supports multi-table DELETE, so you can do a self-join such that only rows with a greater id are included in the deletion.
DELETE s2
FROM search AS s1 JOIN search AS s2
ON s1.string = s2.string AND s1.uid = s2.uid AND s1.id < s2.id;
If you want to add some logic to the join so that it only deletes if the matching s2
rows are not more distant than 1 second apart from the first, you can do that too:
DELETE s2
FROM search AS s1 JOIN search AS s2
ON s1.string = s2.string AND s1.uid = s2.uid AND s1.id < s2.id;
AND s2.timestamp <= s1.timestamp+1
But if you have many entries in a row that are each 1 second apart, this will delete all but the very first. I'm not sure what you want to have done in that case.
Upvotes: 3
Reputation: 412
You can delete using this code, based on a duplicate is the one which have the same uid and string; this code will left only the first appearance.
DELETE FROM search
WHERE id NOT IN
(SELECT MIN(id) FROM search GROUP BY uid, string);
Upvotes: 0