Reputation: 2640
I am not a unicode expert, I read similar posts without any conclusive solution. I need a snippet to read some files with Greek characters. My files have names like
20.10.2011 Ισοζύγιο Πληρωμών- Αύγουστος 2011.xls
I have a generator function that yields filenames:
# -*- coding:utf-8 -*-
import os
import glob
def filesInDir(directory, mask='*.*'):
for root, dir, files in os.walk(directory):
for file in glob.glob(os.path.join(root, mask)):
yield file
Calling this:
for file in filesInDir(directory=r'.'):
with open(file,'r') as f:
print f
gives
IOError: [Errno 22] invalid mode ('r') or filename: '.\\20.10.2011 ?s?????? ?????\xb5??- ?????st?? 2011.xls'
How do a create a valid file object using these sort of filenames?
Upvotes: 2
Views: 1336
Reputation: 336128
You need to make sure that you call os.walk()
with a Unicode string, or it will silently change non-ASCII letters to ASCII (or change them to ?
as you've observed).
So do
for file in filesInDir(directory=u'.'):
with open(file,'r') as f:
print f
and
def filesInDir(directory, mask=u'*.*'):
for root, dir, files in os.walk(directory):
for file in glob.glob(os.path.join(root, mask)):
yield file
See also this similar question.
Upvotes: 5