UzumakiDev

Reputation: 1276

I need to remove duplicate images from my WordPress media library

I'm trying to remove duplicate images from my WordPress media library.

The posts themselves aren't duplicated, but each attachment image appears twice for each post.

I've had a look around, but nobody seems to have a clear-cut answer. Some people have suggested something like this:

  <?php 
    $p = get_posts(array('numberposts'=> -1));
    foreach($p as $t) {
      // Pass the current post's ID so only its own attachments are returned
      $s = get_children(array('post_parent' => $t->ID, 'post_type' => 'attachment', 'numberposts' => -1 )); 
      foreach ($s as $u) {
        var_dump($u);
      }
    }
  ?>

But something still seems to be missing: this gives me a list of attachments, but I still don't know how to compare them.

It seems I need to run some SQL queries and delete the media files straight from the database, but I'm not sure how to go about that.

In theory I need something like: find the posts; for each post, get its attachments; if two attachments have the same filename, delete one of them.

Help appreciated.

Upvotes: 2

Views: 2627

Answers (1)

Manolis

Reputation: 891

Create a PHP file in your theme folder and paste in one of the two solutions below:

1st way (compare attachments by title; recommended):

<?php

require('../../../wp-blog-header.php');
global $wpdb;

$querys = $wpdb->get_results(
    "
    SELECT a.ID, a.post_title, a.post_type
    FROM $wpdb->posts AS a
       INNER JOIN (
          SELECT post_title, MIN( id ) AS min_id
          FROM $wpdb->posts
          WHERE post_type = 'attachment'
          GROUP BY post_title
          HAVING COUNT( * ) > 1
       ) AS b ON b.post_title = a.post_title
    AND b.min_id <> a.id
    AND a.post_type = 'attachment'
    ");

echo "<style>td {padding:0 0 10px;}</style>";
echo "<h2>DUPLICATES</h2>\n";
echo "<table>\n";
echo "<tr><th></th><th>Title</th><th>URL</th></tr>\n";

foreach ( $querys as $query )
{
    $attachment_url = wp_get_attachment_url($query->ID);
    $delete_url = get_delete_post_link($query->ID);
    echo "<tr>
        <td><a style=\"color: #FFF;background-color: #E74C3C;text-decoration: none;padding: 5px;\" target=\"_blank\" href=\"".$delete_url."\">DELETE</a></td>
        <td>".get_the_title($query->ID)."</td>
        <td><a href=\"".$attachment_url."\">".$attachment_url."</a></td>
        </tr>\n";
}

echo "</table>";

?>
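
To make the rule in that query easier to check: for each title that occurs more than once, it keeps the attachment with the lowest ID and lists all the others. Below is a minimal sketch of the same rule in plain PHP, run on made-up sample rows instead of the database. The function name list_duplicates_by_title and the sample data are hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper (not a WordPress function): applies the same rule as
// the SQL above to plain arrays. Each row is ['ID' => int, 'post_title' => string].
function list_duplicates_by_title(array $rows): array
{
    // First pass: remember the lowest ID seen for each title.
    // Note: this is an exact, case-sensitive match; the database collation
    // may compare titles differently.
    $minId = [];
    foreach ($rows as $row) {
        $title = $row['post_title'];
        if (!isset($minId[$title]) || $row['ID'] < $minId[$title]) {
            $minId[$title] = $row['ID'];
        }
    }

    // Second pass: every row that is not the lowest ID for its title
    // counts as a duplicate.
    $duplicates = [];
    foreach ($rows as $row) {
        if ($row['ID'] !== $minId[$row['post_title']]) {
            $duplicates[] = $row['ID'];
        }
    }
    return $duplicates;
}

// Made-up sample data: "sunset" appears three times, "logo" once.
$sample = [
    ['ID' => 12, 'post_title' => 'sunset'],
    ['ID' => 15, 'post_title' => 'logo'],
    ['ID' => 27, 'post_title' => 'sunset'],
    ['ID' => 31, 'post_title' => 'sunset'],
];

// IDs 27 and 31 are reported; ID 12 is kept as the original.
print_r(list_duplicates_by_title($sample));
```

The attachment with the lowest ID is treated as the original, so it never appears in the list and only the later copies get a DELETE link.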

2nd way (compare attachments by URL). Each URL, minus its extension, is compared with every other URL minus its extension and the last character of its name. So if you have file.jpg and file1.jpg, they are caught as duplicates, and the row shown (with its DELETE link) is for the shorter name, file.jpg. The drawback is that file1.jpg and file11.jpg are also caught as duplicates, even though they might be completely different files:

<?php

require('../../../wp-blog-header.php');

$args = array(
    'post_type' => 'attachment',
    'numberposts' => -1,
    'orderby' => 'name',
    'order' => 'ASC',
    'post_status' => null,
    'post_parent' => null, // any parent
);

$newArray = array();

$attachments = get_posts($args);


if ($attachments){

    foreach($attachments as $post){
        $attachment_url = wp_get_attachment_url($post->ID);
        $delete_url = get_delete_post_link($post->ID);
        $newArray[] = array("att_url" => $attachment_url, "del_url" => $delete_url);
    }

    echo "<table>";
        $newArray2 = $newArray;
        echo "<tr><td><h2>DUPLICATES</h2></td></tr>";
        foreach($newArray as $url1){
            $url_del = $url1['del_url'];
            $url1 = $url1['att_url'];
            $url11 = substr($url1,0,strrpos($url1,"."));
            foreach($newArray2 as $url2){
                $url2 = $url2['att_url'];
                $url2 = substr($url2,0,strrpos($url2,".") - 1);

                if($url2 == $url11)
                {
                    echo "<tr><td><a href=\"".$url_del."\">DELETE</a></td><td><a href=\"".$url1."\">".$url1."</a></td></tr>";
                }
            }
        }
    echo "</table>";
    echo "<table>";
        echo "<tr><td><h2>ALL ATTACHMENTS</h2></td></tr>";

        foreach($attachments as $tst1){
            echo "<tr><td>".$tst1->guid."</td></tr>";
        }
    echo "</table>";
}
?>

Then try to access that file in your browser and it will show you a list of all the duplicates.
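
If the false positives of the URL comparison are a problem, a stricter check is possible. The sketch below assumes the duplicates carry the "-1", "-2", ... suffix that WordPress appends when an uploaded filename is already taken; it strips only that suffix and then groups URLs that end up with the same path. The helper names attachment_key and group_duplicates and the example.com URLs are hypothetical. It is still a heuristic (photo-2.jpg could be a genuinely different image from photo.jpg), so review the groups before deleting anything.

```php
<?php
// Hypothetical helper: reduce an attachment URL to a comparison key by
// removing a trailing "-<digits>" suffix from the file name.
// The directory and extension are kept, so only files in the same
// upload folder with the same extension can match.
function attachment_key(string $url): string
{
    $path = (string) parse_url($url, PHP_URL_PATH);
    $info = pathinfo($path);
    $base = preg_replace('/-\d+$/', '', $info['filename']);
    $ext  = isset($info['extension']) ? '.' . $info['extension'] : '';
    return $info['dirname'] . '/' . $base . $ext;
}

// Hypothetical helper: group URLs by key and keep only the groups that
// contain more than one URL.
function group_duplicates(array $urls): array
{
    $groups = [];
    foreach ($urls as $url) {
        $groups[attachment_key($url)][] = $url;
    }
    return array_filter($groups, function (array $group) {
        return count($group) > 1;
    });
}

// Made-up sample URLs.
$urls = [
    'http://example.com/wp-content/uploads/2015/03/file.jpg',
    'http://example.com/wp-content/uploads/2015/03/file-1.jpg',
    'http://example.com/wp-content/uploads/2015/03/file1.jpg',
    'http://example.com/wp-content/uploads/2015/03/file11.jpg',
];

// Only file.jpg and file-1.jpg end up in the same group;
// file1.jpg and file11.jpg are left alone.
print_r(group_duplicates($urls));
```

In the real script, the URLs would come from wp_get_attachment_url() as in the 2nd way above, and each group could be printed with the delete links for everything except its first entry.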

Upvotes: 1
