djangodjames

Reputation: 319

Python Pillow doesn't work with some images

I have 30 000 images to check for size, format and some other things.

I've checked all of them except 200. These 200 images raise an error in Pillow:

from PIL import Image
import requests

url = 'https://img.yakaboo.ua/media/wysiwyg/ePidtrymka_desktop.svg'
image = Image.open(requests.get(url, stream=True).raw)

This gives an error:

PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x7fbfbf59c810>

Here are some other images, that give the same error:

https://www.yakaboo.ua/ua/skin/frontend/bootstrap/yakaboo/images/logo/y-logo.png
https://img.yakaboo.ua/media/wysiwyg/ePidtrymka_desktop.svg
https://img.yakaboo.ua/media/wysiwyg/ePidtrymka_desktop_futer.svg
https://www.yakaboo.ua/ua/skin/frontend/bootstrap/yakaboo/images/icons/googleplay.png
https://www.yakaboo.ua/ua/skin/frontend/bootstrap/yakaboo/images/icons/appstore.png

If I download these images, everything works fine, but I need to check them without downloading. Is there any solution?

Upvotes: 2

Views: 554

Answers (1)

AKX

Reputation: 168824

  1. You're not checking for any errors you might get from requests responses, so chances are you might be trying to identify e.g. an error page.
  2. Pillow doesn't support SVG files (and they don't necessarily have an intrinsic size anyway). You'll need something else to identify them.
  3. You're explicitly asking requests for the raw stream (stream=True and .raw), which skips any decoding of a Content-Encoding. For that y-logo.png, the server sends its response with Content-Encoding: gzip, so Pillow receives gzip-compressed bytes rather than PNG data and can't identify them. The simplest fix is to drop stream=True and .raw, read the response into memory, wrap it with io.BytesIO(resp.content) and pass that to Pillow. If that's not an option, you could also write a file-like wrapper around a requests response, but it's likely not worth the effort.
  4. To save a bunch of time (by reusing connections), use a Requests session.
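Putting the four points together, a minimal sketch could look like the following. The helper names inspect_image_bytes and inspect_image_url, the 10-second timeout, and the Content-Type check for SVG are assumptions made for illustration, not part of Pillow or requests.

```python
import io

import requests
from PIL import Image, UnidentifiedImageError


def inspect_image_bytes(data, content_type=""):
    """Return (format, size) for raster data, ("SVG", None) for SVG,
    or (None, None) if Pillow cannot identify the bytes."""
    # Pillow has no SVG support, so report SVG separately (point 2).
    # Assumption: the server labels SVG files with an "svg" Content-Type.
    if "svg" in content_type.lower():
        return "SVG", None
    try:
        # Wrap the in-memory bytes in a file-like object for Pillow (point 3).
        with Image.open(io.BytesIO(data)) as image:
            return image.format, image.size
    except UnidentifiedImageError:
        return None, None


def inspect_image_url(session, url):
    """Fetch url with a shared session and inspect the decoded body."""
    resp = session.get(url, timeout=10)
    # Fail loudly on 4xx/5xx instead of feeding an error page to Pillow (point 1).
    resp.raise_for_status()
    # resp.content is already decoded if the server used Content-Encoding: gzip.
    return inspect_image_bytes(resp.content, resp.headers.get("Content-Type", ""))


if __name__ == "__main__":
    urls = [
        "https://www.yakaboo.ua/ua/skin/frontend/bootstrap/yakaboo/images/logo/y-logo.png",
        "https://img.yakaboo.ua/media/wysiwyg/ePidtrymka_desktop.svg",
    ]
    # One session reuses connections across all requests (point 4).
    with requests.Session() as session:
        for url in urls:
            try:
                print(url, inspect_image_url(session, url))
            except requests.RequestException as exc:
                print(url, "request failed:", exc)
```

This still transfers each image, but only into memory; nothing is written to disk.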

Upvotes: 3
