Reputation: 51
Python version: 2.7.3
Filename: test snowman character --☃--.mp3
I ran the following tests; none of them succeeded.
>>> os.path.exists('test snowman character --☃--.mp3')
False
>>> os.path.exists(repr('test snowman character --☃--.mp3'))
False
>>> os.path.isfile('test snowman character --\\xe2\\x98\\x83--.mp3')
False
>>> os.path.isfile(r'test snowman character --\\xe2\\x98\\x83--.mp3')
False
>>> os.path.isfile('test snowman character --☃--.mp3'.decode('utf-8'))
False
I also tried to retrieve the file with glob, but that failed too.
The objective is to detect this file and copy it to another folder. Please advise.
Upvotes: 5
Views: 2512
Reputation: 59323
The Windows NTFS filesystem uses UTF-16 (just ask Martijn Pieters), so try this:
>>> os.path.exists(u'test snowman character --☃--.mp3'.encode("UTF-16"))
But first make sure the input encoding of the interpreter is correct. This:
>>> print repr(u'test snowman character --☃--.mp3')
should output:
u'test snowman character --\u2603--.mp3'
Note: I am unable to test this as Windows CMD won't let me input snowman symbols. In any case, it turns out Python will do the right thing if you just give it a Unicode string, so the encode call is superfluous. To summarize, I recommend Martijn Pieters' answer.
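As a small illustration of why the Unicode string is the thing to pass, here is a sketch showing that the UTF-8 bytes from the question (\xe2\x98\x83) and the single code point \u2603 are the same character once decoded. The variable names are only for illustration.

```python
# The snowman written as UTF-8 bytes, as in the question's byte-string attempts.
utf8_bytes = b'test snowman character --\xe2\x98\x83--.mp3'

# Decoding turns the three bytes into one Unicode code point, U+2603.
name = utf8_bytes.decode('utf-8')

# The decoded value equals the literal written with a Unicode escape.
same = (name == u'test snowman character --\u2603--.mp3')
print(same)
```

The decoded `name` is what should be handed to the `os.path` functions, not the byte string.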
Upvotes: 1
Reputation: 1121484
Use a unicode value; preferably with a unicode escape sequence:
os.path.isfile(u'test snowman character --\u2603--.mp3')
Python on Windows will use the correct Windows API for handling UTF-16 file names when you give it a unicode path.
For more information on how Python alters behaviour with unicode vs. bytestring file paths, see the Python Unicode HOWTO.
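For the stated goal of detecting the file and copying it elsewhere, a minimal sketch could look like the following. The helper name `copy_if_present` and the constant `SNOWMAN_FILE` are made up for this example, and the code assumes every path is passed as a unicode string throughout.

```python
import os
import shutil

# The file name from the question, written with a Unicode escape.
SNOWMAN_FILE = u'test snowman character --\u2603--.mp3'


def copy_if_present(src_dir, filename, dst_dir):
    """Copy src_dir/filename into dst_dir.

    Returns the destination path, or None if the source file does not exist.
    (Hypothetical helper, not part of any library.)
    """
    src = os.path.join(src_dir, filename)
    if not os.path.isfile(src):
        return None
    # Create the target folder on first use.
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    dst = os.path.join(dst_dir, filename)
    shutil.copy(src, dst)
    return dst
```

It might be called as `copy_if_present(u'.', SNOWMAN_FILE, u'backup')`, where `backup` is just a placeholder folder name. Keeping the directory arguments as unicode too avoids mixing byte strings and unicode strings in `os.path.join`.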
Upvotes: 3
Reputation: 38956
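Literal Unicode strings are supposed to start with u', so try os.path.exists(u'test snowman character --☃--.mp3')
If you want a raw Unicode literal, the prefix is ur'. Note that in such a literal only \uXXXX escapes are interpreted, so write the character as \u2603 rather than as its UTF-8 bytes, as in os.path.isfile(ur'test snowman character --\u2603--.mp3')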
http://docs.python.org/2.7/reference/lexical_analysis.html#strings
Upvotes: 0