The image saved as JPEG with Pillow is different from the original image.

Question

I have a 3-D NumPy array that I save as a JPEG image using Pillow. When I reload the image with Pillow, the resulting NumPy array is different from the original. Here is a short demo:

from PIL import Image
import numpy as np

file_extension = 'jpeg'

# generate a 2x2 RGB sample image holding the values 1..12
image = range(1, 2*2*3+1)
image = np.uint8(np.array(image).reshape(2, 2, 3))
print('image', image)

# save the image
img = Image.fromarray(image, "RGB")
img.save('test.' + file_extension)

# load the image back
loaded_image = Image.open('test.' + file_extension)
loaded_image = np.array(loaded_image.convert('RGB'))
print('loaded image', loaded_image)

The output of the code is as follows:

image [[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
loaded image [[[ 3  4  6]
  [ 3  4  6]]

 [[ 7  8 10]
  [ 8  9 11]]]

The loaded_image is different from the original image. However, if I change file_extension to 'png', 'bmp', etc., the loaded_image is the same as the original image.

Has anyone had a similar problem, and does anyone know why saving an image in JPEG format with Pillow causes this?

Mark Setchell · Accepted Answer

The answer is very simple...

JPEG is "lossy". It discards the least noticeable details to save space - see the Wikipedia entry for JPEG and scroll down to "Quantisation". It also cannot hold 16-bit per sample/channel data at all, at least not in the 8-bit form that Pillow writes.
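
To give a feel for what quantisation does, here is a minimal sketch. It is not Pillow's or libjpeg's actual code: the coefficient values and step sizes are made up for illustration, and a real encoder applies this to 8x8 blocks of DCT coefficients. Each value is divided by a step size, rounded, and multiplied back on decoding, so anything smaller than the step is lost.

```python
import numpy as np

# Hypothetical frequency coefficients for part of one block (illustrative values).
coefficients = np.array([50.0, -13.0, 7.0, 3.0, -2.0, 1.0])
# Hypothetical quantisation steps; larger steps discard more detail.
steps = np.array([8.0, 6.0, 5.0, 8.0, 12.0, 20.0])

quantised = np.round(coefficients / steps)  # small integers, this is what gets stored
restored = quantised * steps                # what the decoder can recover

print("stored  :", quantised)
print("restored:", restored)
print("error   :", coefficients - restored)
```

The rounding step is where information is thrown away, which is why the decoded pixel values can only approximate the original ones.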

PNG, BMP and TIFF (other than JPEG-encoded TIFF) are lossless - that means you get back exactly what you saved.
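
The difference can be checked with a small round-trip test. This is only a sketch: round_trip is a helper name made up here, and it writes to an in-memory buffer instead of a file on disk. It uses the same 2x2 array as the question.

```python
import io

import numpy as np
from PIL import Image


def round_trip(array, fmt, **save_options):
    """Save `array` to an in-memory file in format `fmt` and load it back."""
    buffer = io.BytesIO()
    Image.fromarray(array).save(buffer, format=fmt, **save_options)
    buffer.seek(0)
    return np.array(Image.open(buffer).convert("RGB"))


# Same sample data as in the question: a 2x2 RGB image holding 1..12.
original = np.arange(1, 13, dtype=np.uint8).reshape(2, 2, 3)

for fmt in ("PNG", "BMP", "TIFF", "JPEG"):
    reloaded = round_trip(original, fmt)
    print(fmt, "identical:", np.array_equal(original, reloaded))
```

Pillow's JPEG writer accepts options such as quality and subsampling, for example round_trip(original, "JPEG", quality=95, subsampling=0). Higher quality usually makes the difference smaller, but you should not expect it to disappear.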

GIF is a bit different as it has a limited palette of at most 256 colours, so you may get back something different from what you saved depending on how many colours your original image had.
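
A rough way to see the palette limit is to save an image with many distinct colours as GIF and count the colours that come back. The test image and the count_colours helper are made up for this sketch. The colours Pillow picks for the palette may vary between versions, so only the count is checked.

```python
import io

import numpy as np
from PIL import Image

# Build a 64x64 image in which every pixel has a distinct colour (4096 colours).
index = np.arange(64 * 64)
red = (index % 16) * 16
green = ((index // 16) % 16) * 16
blue = (index // 256) * 16
original = np.stack([red, green, blue], axis=-1).astype(np.uint8).reshape(64, 64, 3)

# Save as GIF in memory and load it back as RGB.
buffer = io.BytesIO()
Image.fromarray(original).save(buffer, format="GIF")
buffer.seek(0)
reloaded = np.array(Image.open(buffer).convert("RGB"))


def count_colours(array):
    """Count the distinct RGB triples in an (H, W, 3) array."""
    return len(np.unique(array.reshape(-1, 3), axis=0))


print("colours before:", count_colours(original))
print("colours after :", count_colours(reloaded))
```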

If your data is 16-bit per sample/channel, you should probably use PNG, NetPBM or TIFF, because BMP cannot store 16-bit per sample data - what it calls 24-bit means 3 channels of 8 bits each.
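
As a sketch of the 16-bit case, the following round-trips a small single-channel uint16 array through PNG and TIFF in memory. The array values and the round_trip_16bit helper are made up for illustration, and single-channel data is an assumption here because multi-channel 16-bit handling depends on the library and version you use.

```python
import io

import numpy as np
from PIL import Image

# Single-channel 16-bit test data with values well above the 8-bit range.
original = np.array([[0, 1000], [40000, 65535]], dtype=np.uint16)


def round_trip_16bit(array, fmt):
    """Save a 16-bit greyscale array in format `fmt` and load it back."""
    buffer = io.BytesIO()
    Image.fromarray(array).save(buffer, format=fmt)
    buffer.seek(0)
    return np.array(Image.open(buffer))


for fmt in ("PNG", "TIFF"):
    reloaded = round_trip_16bit(original, fmt)
    print(fmt, "identical:", np.array_equal(original, reloaded))
```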
