Maddrax

Reputation: 43

Delete Duplicate Rows in MySQL (WordPress, Comments)

I have a plethora of duplicate comments in my WordPress database, specifically in the wp_comments table. Of course, those comments have different IDs. I'd now like to de-dupe them based on the comment_date field, treating all comments posted at the same date and time as duplicates. I don't care which one of the duplicates remains.

What SQL query do I have to use to achieve this?

Thanks!

EDIT: I don't want to delete one specific comment date across the table; instead I want the database to scan for duplicate dates and keep only one entry for each.

Upvotes: 1

Views: 797

Answers (3)

K. W.

Reputation: 13

The current top/accepted answer deletes duplicate comments; however, it runs into trouble when comment_content contains characters that require escaping. For example, "I'm so happy" entered as a comment twice won't be deleted, because the comment content comparison ends early at the ' in the comment body. $wpdb->_real_escape (docs) helped.
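To make the apostrophe problem concrete, here is a rough sketch of what the generated query looks like with and without escaping. It assumes the default wp_ table prefix and a server that is not running in NO_BACKSLASH_ESCAPES mode (in that mode only the doubled-quote form works).

```sql
-- Unescaped (broken): the string literal ends at the apostrophe in I'm,
-- so the rest of the comment text is parsed as SQL and the statement fails.
-- SELECT comment_ID FROM wp_comments WHERE comment_content = 'I'm so happy';

-- Escaped: both statements compare against the text  I'm so happy
SELECT comment_ID FROM wp_comments WHERE comment_content = 'I\'m so happy';
SELECT comment_ID FROM wp_comments WHERE comment_content = 'I''m so happy';
```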

I fiddled with the accepted answer and came up with the following. This was for a WP Multisite, so it also loops through all of the multisite children. For a single site, you can cut out the outer $blog_ids loop and set the $blog_id variable directly.

Side note: I would also suggest adding this as a WP-CLI command rather than embedding it in a file on the server. It's slightly more flexible, and execution can be stopped immediately if things go sideways.

function delete_duplicates() {
    global $wpdb;

    // collect the ID of every site in the multisite network
    $blog_ids = $wpdb->get_results(
        "SELECT blog_id FROM {$wpdb->base_prefix}blogs", ARRAY_A
    );

    foreach ( $blog_ids as $blog ) :
        $blog_id = (int) $blog['blog_id'];

        // get_blog_prefix() returns the bare base prefix for the main site
        // and "{base_prefix}{blog_id}_" for child sites
        $table = $wpdb->get_blog_prefix( $blog_id ) . 'comments';

        $all_comments = $wpdb->get_results(
            "SELECT comment_author,comment_content,comment_ID FROM {$table}"
        );
        $duplicate_comments = [];

        if ( ! empty( $all_comments ) ) :

            foreach ( $all_comments as $comment ) :
                $comment_id     = $comment->comment_ID;
                $comment_author = $comment->comment_author;
                // escape comment content when fetching so it compares appropriately in the $sql prepared statement
                $escaped_content = $wpdb->_real_escape( $comment->comment_content );

                // find all later comments (higher ID) with matching content and author,
                // so the earliest copy of each duplicate group is never flagged
                $sql = $wpdb->prepare(
                    "SELECT comment_ID FROM {$table} WHERE comment_content = '" . $escaped_content . "' AND comment_ID > %d AND comment_author = %s", $comment_id, $comment_author
                );

                $duplicate_comments = array_merge( $duplicate_comments, $wpdb->get_col( $sql ) );

            endforeach;

            // only add duplicate comment IDs once into array
            $duplicate_comments = array_unique( $duplicate_comments );

            // actually deleting comments that are duplicated
            foreach ( $duplicate_comments as $duplicate_comment ) :

                if ( ! empty( $duplicate_comment ) ) :
                    $wpdb->delete( $table, [ 'comment_ID' => $duplicate_comment ], [ '%d' ] );
                endif;

            endforeach;
        endif;
    endforeach;
}

Upvotes: 0

heathhettig

Reputation: 115

You could do a select-all query and then loop through the results. Inside the loop, run a query that deletes anything that is the same and doesn't have the ID of the current row. Back up first.

Update:

I prefer to keep this kind of code in a separate file in the root directory.

So make a new file in the root, call it whatever you want, and add this code. Run the file AFTER YOU BACK UP your comments and comment meta tables.

<?php
require('./wp-load.php');
global $wpdb; // loads the DB object

$comments = $wpdb->get_results("SELECT * FROM ".$wpdb->prefix."comments");
$deleted = array(); // IDs already removed, so a deleted copy can never delete the comment we kept

foreach((array)$comments as $key => $comment)
{
    $id_to_check = (int)$comment->comment_ID; // keep this comment ID
    if (isset($deleted[$id_to_check]))
    {
        continue; // this row was already deleted as a duplicate of an earlier comment
    }

    // prepare() escapes the values, so quotes in the comment text cannot break the query
    $get_dupes = $wpdb->get_results($wpdb->prepare("SELECT comment_ID FROM ".$wpdb->prefix."comments WHERE (comment_content = %s OR comment_date = %s) AND comment_ID != %d", $comment->comment_content, $comment->comment_date, $id_to_check));

    foreach((array)$get_dupes as $dkey => $dupe)
    {
         $dupe_id = (int)$dupe->comment_ID;
         $wpdb->query($wpdb->prepare("DELETE FROM ".$wpdb->prefix."commentmeta WHERE comment_id = %d", $dupe_id)); // delete duplicate comment meta
         $wpdb->query($wpdb->prepare("DELETE FROM ".$wpdb->prefix."comments WHERE comment_ID = %d", $dupe_id)); // delete duplicate comment
         $deleted[$dupe_id] = true;
    }
}
echo 'all done!';
?>
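If you would rather skip the PHP loop, something like the following might do the same job directly in MySQL. Treat it as a sketch: it assumes the default wp_ table prefix, and it matches on comment_date only (the criterion from the question) rather than on content or date. The aliases dup, kept, meta and c are just names picked for the example. Back up both tables first.

```sql
-- Remove every comment that shares its comment_date with a comment of lower ID,
-- so the lowest comment_ID for each date is the one that survives.
DELETE dup
FROM wp_comments AS dup
JOIN wp_comments AS kept
  ON  kept.comment_date = dup.comment_date
  AND kept.comment_ID   < dup.comment_ID;

-- Remove comment meta rows whose comment no longer exists.
DELETE meta
FROM wp_commentmeta AS meta
LEFT JOIN wp_comments AS c
  ON c.comment_ID = meta.comment_id
WHERE c.comment_ID IS NULL;
```

Running the first statement as a SELECT dup.comment_ID ... with the same FROM and JOIN clauses beforehand shows which rows would be removed.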

Upvotes: 1

NO-ONE_LEAVES_HERE

Reputation: 127

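First count the number of comments that share a given date ...

Example: SELECT COUNT(*) FROM wp_comments WHERE comment_date='blahblah'

Then store the result in a variable, for example $comment_count, and run:

DELETE FROM wp_comments WHERE comment_date='blahblah' LIMIT N

Replace N with the value of $comment_count - 1. MySQL expects a literal number after LIMIT, so do the subtraction in your script before building the query, and repeat this for each duplicated date.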
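A worked version of those two steps might look like the sketch below. It assumes the default wp_ table prefix; the date '2012-05-01 10:15:00' and the limit of 2 are made-up values standing in for a date that occurs three times.

```sql
-- Step 1: list each duplicated date and how many extra rows it has.
SELECT comment_date, COUNT(*) - 1 AS extra_copies
FROM wp_comments
GROUP BY comment_date
HAVING COUNT(*) > 1;

-- Step 2: for one date from that list, delete the extra rows.
-- ORDER BY comment_ID DESC removes the newest copies, leaving the lowest ID in place.
DELETE FROM wp_comments
WHERE comment_date = '2012-05-01 10:15:00'
ORDER BY comment_ID DESC
LIMIT 2;
```

Step 1 also answers the "scan for duplicate dates" part of the question, since it returns only the dates that need cleaning up.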

Upvotes: 0
