Reputation: 726
I am running an image classification task in Python. As part of pre-processing, I need to reshape all images to the same dimensions. While doing this, I noticed that some JPEGs and PNGs have a third dimension, while others do not. Why is this the case? How should I go about normalizing the data?
The images are all color images, and even if I download the images to my computer, I get the same shapes.
from PIL import Image
import requests
from io import BytesIO
import base64
import numpy as np
for url in [
    r'https://c7.uihere.com/files/35/692/872/wikimedia-commons-measuring-scales-clip-art-orthodontist-thumb.jpg',
    r'https://thedesignlove.com/wp-content/uploads/2018/02/297-Food-Stop-Logo-Template.jpg',
    r'https://upload.wikimedia.org/wikipedia/commons/f/ff/BTS_logo_%282017%29.png',
]:
    response = requests.get(url)
    img = Image.open(BytesIO(response.content))
    print(np.asarray(img).shape)
data = '''R0lGODlhDwAPAKECAAAAzMzM/////wAAACwAAAAADwAPAAACIISPeQHsrZ5ModrLlN48CXF8m2iQ3YmmKqVlRtW4MLwWACH+H09wdGltaXplZCBieSBVbGVhZCBTbWFydFNhdmVyIQAAOw=='''
img = Image.open(BytesIO(base64.b64decode(data)))
print(np.asarray(img).shape)
The output is:
(310, 310)
(600, 650, 3)
(1800, 1800, 4)
(15, 15)
As you can see, sometimes the third dimension is missing, and even when it is present, its size is not consistent.
Upvotes: 0
Views: 81
Reputation: 349
If you download those images to your machine and look at their details, you will see that their channel ("colors") counts and sizes are different. There are programmatic ways to check this, but if you're on Windows, you can just right-click the file and select "Details".
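As one possible programmatic check, here is a minimal sketch using Pillow's `mode` attribute and `getbands()` method. The helper name `describe` and the synthetic images are my own stand-ins for the downloaded files (so it runs without network access), not something from the question.

```python
from PIL import Image
import numpy as np

def describe(img):
    """Return (mode, number of bands, array shape) for a PIL image."""
    return img.mode, len(img.getbands()), np.asarray(img).shape

# Synthetic stand-ins (assumption) with the same sizes as the images in the question.
# Note that Image.new takes (width, height), while NumPy reports (height, width[, channels]).
samples = {
    "grayscale": Image.new("L", (310, 310)),
    "rgb": Image.new("RGB", (650, 600)),
    "rgba": Image.new("RGBA", (1800, 1800)),
    "palette": Image.new("P", (15, 15)),
}

for name, img in samples.items():
    print(name, describe(img))
```

Single-band modes such as "L" (grayscale) and "P" (palette) produce 2-D arrays, which is why no third dimension shows up for them.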
The image "https://c7.uihere.com/files/35/692/872/wikimedia-commons-measuring-scales-clip-art-orthodontist-thumb.jpg" is a 310 x 310 single-channel (8-bit) image.
The image "https://thedesignlove.com/wp-content/uploads/2018/02/297-Food-Stop-Logo-Template.jpg" is a 650 x 600 (width x height) 3-channel (24-bit) image, which NumPy reports as (600, 650, 3) because the array shape lists height first.
The image "https://upload.wikimedia.org/wikipedia/commons/f/ff/BTS_logo_%282017%29.png" seems to be an 1800 x 1800 4-channel (32-bit) image, possibly in RGBA format (A is the "alpha" channel, usually used to describe the opacity/transparency level).
So basically, all of the output you see looks correct, and I don't see any problem here. (Note: I didn't look at your fourth case, the raw data image.)
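As for normalizing the data, one common approach (a sketch under my own assumptions, not the only option) is to convert every image to RGB with Pillow's `convert` and then resize it, so each array ends up as (height, width, 3). The helper name `to_rgb_array` and the 224 x 224 target size are hypothetical choices for illustration.

```python
from PIL import Image
import numpy as np

def to_rgb_array(img, size=(224, 224)):
    """Convert any PIL image to an RGB array of shape (height, width, 3).

    `size` is (width, height); 224 x 224 is only an illustrative default.
    """
    rgb = img.convert("RGB")  # L, P and RGBA inputs all become 3-channel RGB
    rgb = rgb.resize(size)    # bring every image to the same dimensions
    return np.asarray(rgb)

# Synthetic stand-ins (assumption) for the four image types in the question
for mode, dims in [("L", (310, 310)), ("RGB", (650, 600)),
                   ("RGBA", (1800, 1800)), ("P", (15, 15))]:
    print(mode, to_rgb_array(Image.new(mode, dims)).shape)
```

Keep in mind that converting RGBA to RGB this way simply discards the alpha channel, so transparent areas keep whatever color is stored underneath them. If that matters for your data, you may want to paste the image onto a solid background first.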
Upvotes: 3