mrmagooey

Reputation: 4982

Python open() unicode filename behaviour different across OSes

With a filename looking like:

filename = u"/direc/tories/español.jpg"

And using open() as:

fp = open(filename, "rb")

This correctly opens the file on OS X (10.7), but on Ubuntu 11.04 open() tries to open u"espa\xf1ol.jpg" and fails with an IOError.

While trying to fix this I checked sys.getfilesystemencoding() on both systems; both are set to utf-8 (Ubuntu reports it in uppercase, i.e. UTF-8, and I'm not sure whether that is relevant). I've also set # -*- coding: utf-8 -*- in the Python file, but as far as I know this only affects the encoding of the source file itself, not external functions or how Python deals with system resources. The file exists on both systems, with the eñe displayed correctly.
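One way to turn the check described above into something repeatable is to test whether the filename can be encoded with the filesystem encoding that the running process actually sees. This is only a diagnostic sketch; the helper name can_encode_for_filesystem is hypothetical, and it does not by itself explain the IOError.

```python
# -*- coding: utf-8 -*-
import sys


def can_encode_for_filesystem(name, encoding=None):
    """Return True if `name` (a unicode string) can be encoded with `encoding`.

    `encoding` defaults to sys.getfilesystemencoding(), i.e. whatever the
    current process reports. This helper is a hypothetical illustration.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    try:
        name.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    filename = u"espa\xf1ol.jpg"
    print(sys.getfilesystemencoding())
    print(can_encode_for_filesystem(filename))
```

Running this inside the same process that raises the IOError (rather than in a separate interactive shell) shows the encoding that process is really using.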

The end question is: How do I open the español.jpg file on the Ubuntu system?

Edit: The español.jpg string is actually coming out of a database via Django's ORM (ImageFileField), but by the time I'm dealing with it and seeing the difference in behaviour I have a single unicode string which is an absolute path to the file.

Upvotes: 5

Views: 2321

Answers (2)

Marcin

Reputation: 49816

It's not enough to simply set the file encoding at the top of your file. Make sure that your editor is using the same encoding and is saving the text in that encoding. If necessary, re-type any non-ASCII characters to ensure that your editor is doing the right thing.

If your value is coming from, e.g., a database, you will still need to ensure that nowhere along the line is it being converted to a non-unicode (byte) string.
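A minimal sketch of that last check, assuming the value may arrive either as unicode text or as UTF-8 encoded bytes. The helper name ensure_text and the default of UTF-8 are assumptions for illustration, not something the database layer guarantees.

```python
# -*- coding: utf-8 -*-


def ensure_text(value, encoding="utf-8"):
    """Return `value` as a unicode string.

    Byte strings are decoded with `encoding` (assumed to be UTF-8 here);
    values that are already unicode text are returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


if __name__ == "__main__":
    # Bytes as they might come back from a misconfigured layer.
    raw = b"/direc/tories/espa\xc3\xb1ol.jpg"
    print(repr(ensure_text(raw)))
```

Checking type(value) at the point where the path leaves the ORM shows whether anything upstream has already turned it into bytes.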

Upvotes: 1

Felix Yan

Reputation: 15259

This one below should work in both cases:

import sys

fp = open(filename.encode(sys.getfilesystemencoding()), "rb")
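The same idea can be wrapped in a small helper so the encoding step is not repeated at every call site. This is a sketch only: open_unicode_path is a hypothetical name, and it assumes the path is a unicode string and a POSIX-style filesystem that accepts byte-string paths.

```python
# -*- coding: utf-8 -*-
import sys


def open_unicode_path(path, mode="rb", encoding=None):
    """Open `path` after encoding it to bytes explicitly.

    `path` is assumed to be a unicode string. `encoding` defaults to
    sys.getfilesystemencoding(), as in the one-liner above.
    """
    if encoding is None:
        encoding = sys.getfilesystemencoding()
    # Passing bytes means open() does no implicit encoding of its own.
    return open(path.encode(encoding), mode)
```

Usage would look like fp = open_unicode_path(filename), followed by fp.read() and fp.close(). If the process reports a filesystem encoding that cannot represent ñ (for example ASCII), the encode call raises UnicodeEncodeError instead of silently opening the wrong name.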

Upvotes: 2
