Reputation: 3
I am trying to walk a directory structure on Windows, and the UTF-8 output is giving me a hassle. Specifically, a backslash is being added to the end of each filename.
import os, sys
f = open('output.txt','wb')
sys.stdout = f
tmp=''.encode('utf-8')
for dirname, dirnames, filenames in os.walk('d:\media'):
    # print path to all filenames.
    for filename in filenames:
        tmp=os.path.join(dirname, filename,'\n').encode('utf-8')
        sys.stdout.write(tmp)
Without the '\n' the file is one big long string without the added backslash:
d:\media\dir.txtd:\media\Audio\Acda en de Munnik - Waltzing Mathilda (live).mp3d:\media\Audio\BalladOfMosquito.mp3\
With it I get the following:
d:\media\dir.txt\
d:\media\Audio\Acda en de Munnik - Waltzing Mathilda (live).mp3\
d:\media\Audio\BalladOfMosquito.mp3\
While I can deal with the extra character in the program that will read this file, I'd rather know why this is happening.
Upvotes: 0
Views: 578
Reputation: 178429
That's not the way to redirect to a file, and there is no need to micro-manage the encoding. os.path.join() adds a backslash between every element joined, including between filename and '\n'. Let print add the newline as shown below, or use .write(tmp + '\n').
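To see the join behavior in isolation, here is a minimal sketch. It assumes you may not be on Windows when you try it, so it uses ntpath (the Windows flavor of os.path, importable on any platform) instead of os.path; the path and filename are taken from the question's output.

```python
import ntpath  # the Windows implementation of os.path, importable on any OS

# join() puts a separator before each later component, so a '\n' passed as
# a component gets its own backslash in front of it.
with_newline_arg = ntpath.join(r'd:\media', 'dir.txt', '\n')
print(repr(with_newline_arg))

# Joining only the real path components and appending the newline afterwards
# avoids the stray backslash.
newline_appended = ntpath.join(r'd:\media', 'dir.txt') + '\n'
print(repr(newline_appended))
```

The first repr ends in a backslash followed by the newline, which is the trailing character seen in output.txt; the second ends in the filename and the newline only.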
import os, sys
# Open the file in the encoding you want.
# Use 'with' to automatically close the file.
with open('output.txt','w',encoding='utf8') as f:
    # Use a raw string r'' if you use backslashes in paths to prevent accidental escape codes.
    for dirname, dirnames, filenames in os.walk(r'd:\media'):
        for filename in filenames:
            tmp = os.path.join(dirname, filename)
            # print normally adds a newline, so just redirect to a file
            print(tmp,file=f)
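On the raw-string comment: 'd:\media' only works without the r prefix because \m is not a recognized escape sequence, and newer Python versions warn about it. A short sketch of what goes wrong with a path that does contain one, using a made-up folder name d:\new purely for illustration:

```python
# 'd:\new' is a hypothetical path chosen because it contains the \n escape.
escaped = 'd:\new'   # '\n' is read as a single newline character
raw = r'd:\new'      # raw string: the backslash stays a backslash

print(len(escaped), len(raw))        # the escaped version is one character shorter
print('\n' in escaped, '\n' in raw)  # only the escaped version contains a newline
print(raw == 'd:\\new')              # a raw string equals the doubled-backslash form
```

Either r'...' or doubled backslashes gives the path you meant; the raw string is just easier to read.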
Upvotes: 1