Reputation: 877
I've got some MySQL tables with redundant data that I need to remove. For example:
id email date data...
1 [email protected] 2012-01-01 my_data
2 [email protected] 2012-01-01 my_data
3 [email protected] 2012-01-02 my_data
4 [email protected] 2012-01-02 my_data (redundant)
5 [email protected] 2012-01-02 my_data
I need to DELETE the redundant rows, but I'd like to select them first. I found this on StackOverflow, but it requires the email address
SELECT *
FROM `my_table`
WHERE `id` IN (SELECT `id`
FROM `my_table`
where `email` = '[email protected]'
group by `date`
HAVING count(*) > 1)
What query can i use like above that does not use the WHERE qualifier in the embedded query so I can do it fall all email addresses?
The query can be a SELECT query. I don't mind removing the rows manually in PHPMyAdmin.
Upvotes: 2
Views: 825
Reputation: 2877
Another approach is to count the number of occurrences of the date column for each email address in your table:
SELECT `email`, `date`, COUNT(*) FROM `my_table` GROUP BY `date`, `email` HAVING COUNT(*) > 1
+------------------+---------------------+----------+
| email | date | COUNT(*) |
+------------------+---------------------+----------+
| [email protected] | 2012-01-02 00:00:00 | 2 |
+------------------+---------------------+----------+
Upvotes: 0
Reputation: 263933
DELETE FROM tableName
WHERE ID NOT IN
(
SELECT minID
FROM
(
SELECT email, date, MIN(id) minID
FROM tableNAme
GROUP BY email, date
) x
)
or by using JOIN
DELETE a
FROM tableName a
LEFT JOIN (
SELECT minID
FROM (
SELECT email, DATE, MIN(id) minID
FROM tableNAme
GROUP BY email, DATE
) y
) x
ON a.ID = x.minID
WHERE x.minID IS NULL;
The following query only SELECT
duplicated rows for each email and date
SELECT a.*
FROM tableName a
LEFT JOIN
(
SELECT minID
FROM
(
SELECT email, date, MIN(id) minID
FROM tableNAme
GROUP BY email, date
)y
) x ON a.ID = x.minID
WHERE x.minID IS NULL
Upvotes: 7