Reputation: 29597
I am trying to load an image from a URL:
http://www.lifeasastrawberry.com/wp-content/uploads/2013/04/IMG_1191-1024x682.jpg
However, it fails with IOError("cannot identify image file") in the last step (the Image.open call). I'm not sure what is going on or how to fix it. The same code has worked with many other image URLs.
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
opener.addheaders = [('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8')]
opener.addheaders = [('Accept-Encoding', 'gzip,deflate,sdch')]
response = opener.open(image_url,None,5)
img_file = cStringIO.StringIO(response.read())
image = Image.open(img_file)
This URL also fails:
http://www.canadianliving.com/img/photos/biz/Greek-Yogurt-Ceaser-Salad-Dressi1365783448.jpg
Upvotes: 1
Views: 1626
Reputation: 7602
The problem is that your Accept-Encoding header asks the server for a gzip-encoded response, so the image data you receive is gzip-compressed and PIL cannot identify it. You can solve this either by leaving the Accept-Encoding header off your request, or by decompressing the gzip-encoded result manually:
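import cStringIO
import gzip
import urllib2

from PIL import Image

url = 'http://www.lifeasastrawberry.com/wp-content/uploads/2013/04/IMG_1191-1024x682.jpg'

opener = urllib2.build_opener()
# Set every header in one list; assigning addheaders repeatedly keeps only the last one.
opener.addheaders = [
    ('User-agent', 'Mozilla/5.0'),
    ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'),
    ('Accept-Encoding', 'gzip,deflate,sdch'),
]
# This assumes the server answered with gzip, so decompress before handing the data to PIL.
gzipped_file = cStringIO.StringIO(opener.open(url, None, 5).read())
image = Image.open(gzip.GzipFile(fileobj=gzipped_file))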
The problem with this approach is that if you accept multiple encodings in your HTTP request, then you'll need to look in the HTTP headers of the result to see which encoding you actually got, and then decode manually based on whatever that value indicates.
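As a rough sketch of that manual decoding step, something like the following could work. The helper name `decode_body` is made up for illustration, and it only covers identity, gzip and deflate (sdch is not handled). It uses `io.BytesIO` rather than `cStringIO` so that it should run on both Python 2.7 and Python 3.

```python
import gzip
import io
import zlib


def decode_body(body, content_encoding):
    """Return the raw bytes of an HTTP body, undoing its Content-Encoding.

    `decode_body` is a hypothetical helper, not part of urllib2 or PIL.
    """
    encoding = (content_encoding or "identity").strip().lower()
    if encoding in ("identity", ""):
        # Nothing to undo: the server sent the bytes as-is.
        return body
    if encoding in ("gzip", "x-gzip"):
        return gzip.GzipFile(fileobj=io.BytesIO(body)).read()
    if encoding == "deflate":
        try:
            # "deflate" is normally zlib-wrapped data...
            return zlib.decompress(body)
        except zlib.error:
            # ...but some servers send a raw deflate stream instead.
            return zlib.decompress(body, -zlib.MAX_WBITS)
    raise ValueError("unsupported Content-Encoding: %r" % encoding)
```

With the opener from the snippet above, this would presumably be used as `response = opener.open(url, None, 5)` followed by `Image.open(io.BytesIO(decode_body(response.read(), response.info().get('Content-Encoding'))))`.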
I think it's easier to set the Accept-Encoding header to a value that accepts only one encoding (e.g., 'identity;q=1, *;q=0' or something like that), or to go ahead and start using the requests package for HTTP, which decodes gzip and deflate responses for you.
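For the single-encoding route, a minimal sketch might look like this. The function name `build_image_request` is hypothetical, and the import fallback is only there so the same snippet runs on Python 2 (`urllib2`, as in the question) and Python 3 (`urllib.request`).

```python
try:
    import urllib2 as urlrequest  # Python 2, as used in the question
except ImportError:
    import urllib.request as urlrequest  # Python 3 equivalent


def build_image_request(url):
    """Build a request that asks the server not to compress the body.

    `build_image_request` is a hypothetical helper for illustration.
    """
    headers = {
        "User-Agent": "Mozilla/5.0",
        # Accept only the uncompressed ("identity") encoding.
        "Accept-Encoding": "identity;q=1, *;q=0",
    }
    return urlrequest.Request(url, headers=headers)
```

The response from `urlrequest.urlopen(build_image_request(image_url), timeout=5)` should then carry plain image bytes, provided the server honors the header, so its `read()` result can be wrapped in a file-like object and passed straight to `Image.open`.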
Upvotes: 1