malen_c

Reputation: 25

Python urllib2 Images Distorted

I'm writing a program that uses images from http://placekitten.com, but I've run into a problem. When I save an image with this code:

im = urllib2.urlopen(url).read()
f = open('kitten.jpeg', 'w')
f.write(im)
f.close()

The image turns out distorted with mismatched colors, like this:

http://imgur.com/zVg64Kn.jpeg

I was wondering if there was an alternative to extracting images with urllib2. If anyone could help, that would be great!

Upvotes: 2

Views: 218

Answers (3)

Martijn Pieters

Reputation: 1123440

You need to open the file in binary mode:

f = open('kitten.jpeg', 'wb')

Otherwise the file is opened in text mode, and Python translates line endings to the native platform form as it writes. That transformation breaks binary data, as documented for the open() function:

The default is to use text mode, which may convert '\n' characters to a platform-specific representation on writing and back on reading. Thus, when opening a binary file, you should append 'b' to the mode value to open the file in binary mode, which will improve portability.
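To make the effect concrete, here is a minimal sketch that imitates the translation in pure Python. It assumes the Windows convention of turning each '\n' byte (0x0A) into '\r\n' (0x0D 0x0A); the helper name and the sample bytes are made up for illustration, and the real translation happens inside the platform's file layer rather than in a function like this.

```python
# Hypothetical helper: imitates what a text-mode write does on Windows,
# where every 0x0A byte is expanded to 0x0D 0x0A.
def simulate_windows_text_write(data):
    """Return the bytes as they would land on disk after translation."""
    return data.replace(b"\n", b"\r\n")

# Made-up sample: JPEG start marker (FF D8), three payload bytes, end marker (FF D9).
# The two 0x0A bytes are image data, not line endings.
sample = b"\xff\xd8\x0a\x10\x0a\xff\xd9"
corrupted = simulate_windows_text_write(sample)

print(len(sample), len(corrupted))  # the "file" grew by one byte per 0x0A
print(corrupted == sample)          # the payload no longer matches
```

Every inserted byte shifts the rest of the image data, which is consistent with the distorted, mismatched colors in the question.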

When copying data from a URL to a file, you could use shutil.copyfileobj() to handle streaming efficiently:

from shutil import copyfileobj

im = urllib2.urlopen(url)
with open('kitten.jpeg', 'wb') as out:
    copyfileobj(im, out)

This will read data in chunks, avoiding filling memory with large blobs of binary data. The with statement handles closing the file object for you.
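For intuition, the chunked copy can be sketched by hand roughly as follows. This is not the shutil source, only an approximation of the idea; the function name, the chunk size, and the io.BytesIO objects standing in for the response and the output file are assumptions made for the example.

```python
import io

def copy_in_chunks(source, destination, chunk_size=16 * 1024):
    """Copy between file-like objects one chunk at a time; return bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the source is exhausted
            break
        destination.write(chunk)
        total += len(chunk)
    return total

# In-memory stand-ins for the URL response and the output file.
fake_response = io.BytesIO(b"\x00\x0a\xff" * 1000)
out = io.BytesIO()
copied = copy_in_chunks(fake_response, out, chunk_size=256)
print(copied)
```

Only one chunk is held in memory at a time, so the size of the download does not matter.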

Upvotes: 4

Blender

Reputation: 298374

If you're using Windows, you have to open the file in binary mode:

f = open('kitten.jpeg', 'wb')

Or more Pythonically:

import urllib2

url = 'http://placekitten.com.s3.amazonaws.com/homepage-samples/200/140.jpg'
image = urllib2.urlopen(url).read()

with open('kitten.jpg', 'wb') as handle:
    handle.write(image)
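On Python 3, where urllib2.urlopen lives in urllib.request, the same idea might look like the sketch below. The function name save_url is made up for illustration; the binary-mode open is the part that matters.

```python
import shutil
from urllib.request import urlopen  # Python 3 location of urllib2.urlopen

def save_url(url, path):
    """Download url and write the raw bytes to path, opened in binary mode."""
    with urlopen(url) as response, open(path, "wb") as handle:
        # Stream the body to disk in chunks instead of reading it all at once.
        shutil.copyfileobj(response, handle)
```

Calling save_url(url, 'kitten.jpg') with the url defined above should then produce an undistorted image.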

Upvotes: 0

Jon S.

Reputation: 1378

Change

f = open('kitten.jpeg', 'w')

to read

f = open('kitten.jpeg', 'wb')

See http://docs.python.org/2/library/functions.html#open for more information. What's happening is that bytes in the JPEG that happen to match a newline character are being modified as the file is saved; opening the file in binary mode prevents this.

Upvotes: 1
