rajagopalx

Reputation: 3104

How to read Unicode filenames in Python?

I have read many forum threads about Unicode and UTF-8 but still cannot get this to work. I am using Windows.

Suppose there are two folders:

E:\old
---- திருக்குறள்.txt
---- many more unicode named files

E:\new
----

Language : Tamil

Assume I want to move the files to E:\new. I cannot access the Unicode filenames properly.

What I Tried

import sys
import os
from shutil import copyfile

path = 'E:/old/'
for root, _, files in os.walk(ur''.join(path)):
    files = [f for f in files]
    copyfile(files[0].encode('utf-8').strip(),'E:/new/')   # just for example

Error:

Traceback (most recent call last):
  File "new.py", line 8, in <module>
    copyfile(files[0].encode('utf-8').strip(),'E:/new/')
  File "C:\Python27\lib\shutil.py", line 82, in copyfile
    with open(src, 'rb') as fsrc:
IOError: [Errno 2] No such file or directory: '\xe0\xae\xa4\xe0\xae\xbf\xe0\xae\xb0\xe0\xaf\x81\xe0\xae\x95\xe0\xaf\x8d\xe0\xae\x95\xe0\xaf\x81\xe0\xae\xb1\xe0\xae\xb3\xe0\xaf\x8d.txt'

Upvotes: 2

Views: 1632

Answers (1)

Mark Tolonen

Reputation: 178419

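On Windows, use Unicode paths. If you pass a Unicode string to os.walk(), it returns Unicode filenames that can be handed directly to shutil; don't encode them to UTF-8. Python 2 treats byte-string paths on Windows as ANSI code page text, so the UTF-8 bytes name a file that doesn't exist. Note also that os.walk() yields bare filenames, which must be joined with the directory it reports, and that shutil.copyfile needs a destination file path, not a folder. Since you are using os.walk() you'll need to handle paths to subdirectories correctly, but you could just use shutil.copytree instead. If you don't need subdirectories, use os.listdir.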
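
For the flat case (no subdirectories), a minimal sketch with os.listdir might look like the following. The helper name copy_flat is hypothetical, and the sketch assumes both arguments are text (Unicode) strings so that os.listdir returns Unicode names:

```python
import os
import shutil


def copy_flat(src_dir, dst_dir):
    """Copy every regular file directly inside src_dir into dst_dir.

    copy_flat is a hypothetical helper name. Pass text (Unicode) paths
    so that os.listdir returns Unicode filenames.
    """
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if os.path.isfile(src):  # skip subdirectories
            # copyfile needs a full destination file path, not a folder
            shutil.copyfile(src, os.path.join(dst_dir, name))
            copied.append(name)
    return copied
```

With the folders from the question, the call would be copy_flat(u'E:/old', u'E:/new').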

Here's something that works with os.walk:

import errno
import os
import shutil

# Unicode paths in, Unicode names out (this matters on Python 2).
for path, dirs, files in os.walk(u'old'):
    for filename in files:
        # build the source path
        src = os.path.join(path, filename)
        # build the destination path relative to the source path
        dst = os.path.join(u'new', os.path.relpath(src, u'old'))
        try:
            # ensure the destination directories and subdirectories exist
            os.makedirs(os.path.dirname(dst))
        except OSError as e:
            # FileExistsError is Python 3 only; checking errno works on 2.7 too
            if e.errno != errno.EEXIST:
                raise
        shutil.copyfile(src, dst)
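
If the whole tree should be copied and the destination folder does not exist yet, shutil.copytree can replace the manual walk. A minimal sketch, where copy_tree_once is a hypothetical wrapper name and the paths are again assumed to be Unicode strings:

```python
import shutil


def copy_tree_once(src_dir, dst_dir):
    """Recursively copy src_dir to dst_dir with shutil.copytree.

    copy_tree_once is a hypothetical wrapper name. With its default
    arguments copytree creates dst_dir itself and raises an OSError
    if dst_dir already exists.
    """
    shutil.copytree(src_dir, dst_dir)
```

Because E:\new already exists in the question, this would raise an error there unless the empty folder is removed first or a not-yet-existing subfolder is used as the destination.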

Upvotes: 2
