chew socks

Reputation: 1446

Images opened in Pillow and OpenCV are not equivalent

I downloaded a test image from Wikipedia (the tree seen below) to compare Pillow and OpenCV (using cv2) in Python. Perceptually the two images appear the same, but their respective MD5 hashes don't match, and if I subtract the two images the result is not even close to solid black (the image shown below the original). The original image is a JPEG. If I convert it to a PNG first, the hashes match.

The last image shows the frequency distribution of the pixel value differences.

As Catree pointed out, my subtraction was causing integer overflow. I updated the code to convert to dtype=int before the subtraction (to show the negative values) and then take the absolute value before plotting the difference. Now the difference image is perceptually solid black.
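
To illustrate the overflow described above, here is a minimal sketch. The pixel values are made up for illustration and are not taken from the tree image. It shows that subtracting two uint8 arrays wraps around modulo 256, while casting to a signed type first keeps the sign of each difference.

```python
import numpy as np

# Hypothetical pixel values, chosen only to illustrate the wraparound
a = np.array([10, 200, 128], dtype=np.uint8)
b = np.array([12, 199, 128], dtype=np.uint8)

wrapped = a - b                          # uint8 arithmetic wraps modulo 256
signed = a.astype(int) - b.astype(int)   # signed arithmetic keeps the sign

print(wrapped.tolist())         # [254, 1, 0]
print(signed.tolist())          # [-2, 1, 0]
print(np.abs(signed).tolist())  # [2, 1, 0]
```

A difference of -2 shows up as 254 in the wrapped result, which is why a nearly identical pair of images can look very different after a plain uint8 subtraction.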

This is the code I used:

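from PIL import Image
import cv2
import sys
import hashlib
import numpy as np

def hashIm(path):
    # Decode the file with Pillow (RGB or RGBA channel order)
    imP = np.array(Image.open(path))

    # Convert to BGR and drop alpha channel if it exists
    imP = imP[..., 2::-1]
    # Make the array contiguous again
    imP = np.ascontiguousarray(imP)
    # Decode the same file with OpenCV (BGR channel order)
    im = cv2.imread(path)

    # Subtract as signed integers so negative differences do not wrap around
    diff = im.astype(int) - imP.astype(int)

    cv2.imshow('cv2', im)
    cv2.imshow('PIL', imP)
    cv2.imshow('diff', np.abs(diff).astype(np.uint8))
    # Casting the signed difference back to uint8 reproduces the overflowed image
    cv2.imshow('diff_overflow', diff.astype(np.uint8))

    # Write how often each signed difference value occurs
    with open('dist.csv', 'w') as outfile:
        for i in range(-255, 256):
            outfile.write('{},{}\n'.format(i, np.count_nonzero(diff == i)))

    cv2.waitKey(0)
    cv2.destroyAllWindows()

    # Hash the raw pixel bytes of each decoded image
    return hashlib.md5(im.tobytes()).hexdigest() + '   ' + hashlib.md5(imP.tobytes()).hexdigest()

if __name__ == '__main__':
    print(sys.argv[1] + '\t' + hashIm(sys.argv[1]))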

Original photo of a tree (from Wikipedia "Tree" article)

Frequency distribution updated to show negative values.

Updated difference


This is what I was seeing before I implemented the changes recommended by Catree.

Difference

Dist

Upvotes: 4

Views: 3050

Answers (1)

Catree

Reputation: 2517

"The original image is a JPEG."

JPEG decoding can produce different results depending on the libjpeg version, compiler optimization, platform, etc.

Check which version of libjpeg Pillow and OpenCV are using.
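
One possible way to check this is sketched below. It assumes a reasonably recent Pillow in which `PIL.features.version` and `PIL.features.check_feature` are available, and it relies on `cv2.getBuildInformation()`, which returns OpenCV's build report as text. The helper names `jpeg_lines` and `report_jpeg_decoders` are hypothetical. The filter is a simple substring match, so it also picks up JPEG 2000 lines.

```python
def jpeg_lines(build_info):
    """Return the lines of an OpenCV build report that mention JPEG."""
    return [line.strip() for line in build_info.splitlines()
            if "jpeg" in line.lower()]


def report_jpeg_decoders():
    """Print the JPEG library information reported by Pillow and OpenCV."""
    # Imported here so jpeg_lines can be used without either library installed
    import cv2
    from PIL import features

    # Assumption: PIL.features.version is available (recent Pillow releases)
    print("Pillow JPEG library:", features.version("jpg"))
    print("Pillow uses libjpeg-turbo:", features.check_feature("libjpeg_turbo"))
    for line in jpeg_lines(cv2.getBuildInformation()):
        print("OpenCV:", line)
```

Calling `report_jpeg_decoders()` in the same environment that runs the comparison prints both reports side by side. If the two libraries report different JPEG implementations or versions, small pixel differences are plausible.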

For more information, see the answer to "JPEG images have different pixel values across multiple devices".

BTW, (im-imP) produces uint8 overflow (there is no way to have such a high number of large pixel differences without seeing them in your frequency chart). Try casting to an int type before doing your frequency computation.
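
A minimal sketch of such a frequency computation follows. The helper `signed_diff_counts` and the sample arrays are hypothetical. `np.int16` is used as the signed type because it can hold every difference between two uint8 values (-255 to 255).

```python
import numpy as np

def signed_diff_counts(a, b):
    """Count how often each signed pixel difference a - b occurs."""
    # Cast to a signed type first so the subtraction cannot wrap around
    diff = a.astype(np.int16) - b.astype(np.int16)
    values, counts = np.unique(diff, return_counts=True)
    return {int(v): int(c) for v, c in zip(values, counts)}

# Hypothetical pixel values, chosen only for illustration
a = np.array([[10, 200], [128, 0]], dtype=np.uint8)
b = np.array([[12, 199], [128, 2]], dtype=np.uint8)
print(signed_diff_counts(a, b))  # {-2: 2, 0: 1, 1: 1}
```

With two decoders that differ only in rounding, the counts should cluster around zero instead of spreading toward 255.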

Upvotes: 5
