turbonerd
turbonerd

Reputation: 1306

Deleting rows from MySQL DB if duplicate

I'm using the following query to display duplicate entries in my database (looks like I've accidentally run the script twice, though it may have been more as it may have been a malicious user).

I'm using the following query, though it doesn't actually satisfy my requirements:

SELECT meta_value, COUNT(meta_value) AS cnt
FROM wp_postmeta
GROUP BY meta_value
HAVING cnt > 1
ORDER BY cnt;

I'd also like the query to check if the duplicate entries have the same post_id and meta_key.

So, for example:

meta_id     post_id     meta_key    meta_value
1           10          size        large
2           10          colour      blue
3           10          size        large
4           11          size        large

meta_id 1 and 3 are duplicate entries.

I would like to delete ALL BUT ONE of such entries.

Is there a way of doing this with a single query? If I could view the rows first to ensure that the query does as expected, that would be great.

Thanks in advance,

Upvotes: 1

Views: 1065

Answers (4)

user158017
user158017

Reputation: 2991

EDIT: nnichols answer is way better.


try something like this - there may be a more efficient way, but off the top of my head it seems to work.

delete from wp_postmeta
where meta_id in 
(select meta_id
from 
     (select meta_key, meta_value, post_id
      from wp_postmeta
      group by meta_key, meta_value, post_id
      having count(*) > 1) problemGroups
inner join wp_postmeta a
      on a.meta_key = problemGroups.meta_key
      and a.meta_value = problemGroups.meta_value
      and a.post_id = problemGroups.post_id) allIDs
and meta_id not in
(select min(meta_id)
from 
     (select meta_key, meta_value, post_id
      from wp_postmeta
      group by meta_key, meta_value, post_id
      having count(*) > 1) problemGroups
inner join wp_postmeta a
      on a.meta_key = problemGroups.meta_key
      and a.meta_value = problemGroups.meta_value
      and a.post_id = problemGroups.post_id
group by problemGroups.meta_key, meta_value, port_id) minIDS

Upvotes: 1

user1191247
user1191247

Reputation: 12973

Try this -

DELETE t2
FROM wp_postmeta t1
INNER JOIN wp_postmeta t2
    ON t1.post_id = t2.post_id
    AND t1.meta_key = t2.meta_key
    AND t1.meta_value = t2.meta_value
    AND t1.meta_id < t2.meta_id

Take note of @RolandBouman's warning about references to the rows you are deleting.

Upvotes: 4

Roland Bouman
Roland Bouman

Reputation: 31961

"I'd also like the query to check if the duplicate entries have the same post_id and meta_key."

then, use those in the group by as well.

SELECT meta_value, COUNT(meta_value) AS cnt
FROM wp_postmeta
GROUP BY post_id, meta_key, meta_value
HAVING cnt > 1
ORDER BY cnt;

"I would like to delete ALL BUT ONE of such entries."

This is unfortunately not so easy in MySQL. (see http://dev.mysql.com/doc/refman/5.5/en/delete.html)

There is something called multi-table DELETE syntax but its useless if you need to join to the same table as the one where you want to delete from. Using a subquery also won't fly, because you can't select from the same table as the one you're deleting from.

Unfortunately the easiest way to do this is to create a temporary table based on the group by query and use that to join to:

CREATE TABLE wp_postmeta_delete
AS
SELECT MIN(meta_id) meta_id
,      post_id 
,      meta_key
,      meta_value
FROM wp_postmeta
GROUP BY post_id, meta_key, meta_value
HAVING count(*) > 1;

DELETE     wp_postmeta.*
FROM       wp_postmeta
INNER JOIN wp_postmeta_delete t2
ON         wp_postmeta.meta_id   != t2.meta_id
AND        wp_postmeta.post_id    = t2.post_id
AND        wp_postmeta.meta_key   = t2.meta_key
AND        wp_postmeta.meta_value = t2.meta_value;

After deleting the rows you can discard the temporary table:

DROP TABLE wp_postmeta_delete;

Note that in many cases just deleting duplicates may not be good enough; If other tables point to duplicate rows, then you should migrate those references to point at the corresponding unique row that you're keeping around.

Upvotes: 2

James Glass
James Glass

Reputation: 4300

I haven't tested, but something like this should work (this assumes your select statement grabs what you want it to)

DELETE FROM wp_postmeta
WHERE meta_id IN (
                SELECT meta_id
                FROM wp_postmeta
                GROUP BY meta_value
                HAVING COUNT(meta_value) > 1
                ORDER BY cnt
                );

Upvotes: 0

Related Questions