Reputation: 83
I am writing a Python program that reads the output of the DOS tree command, saved to a text file. When I reach the 533rd iteration of the loop, Eclipse gives an error:
Traceback (most recent call last):
  File "E:\Peter\Documents\Eclipse Workspace\MusicManagement\InputTest.py", line 24, in <module>
    input = myfile.readline()
  File "C:\Python33\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3551: character maps to undefined
I have read other posts, but setting the encoding to latin-1 does not resolve the issue: it raises a UnicodeDecodeError on a different character, and the same happens with utf-8.
The following is the code:
import os
from Album import *

os.system("tree F:\\Music > tree.txt")
myfile = open('tree.txt')
myfile.readline()
myfile.readline()
myfile.readline()
albums = []
x = 0
while x < 533:
    if not input: break
    input = myfile.readline()
    if len(input) < 14:
        artist = input[4:-1]
    elif input[13] != '-':
        artist = input[4:-1]
    else:
        albums.append(Album(artist, input[15:-1], input[8:12]))
    x += 1
for x in albums:
    print(x.artist + ' - ' + x.title + ' (' + str(x.year) + ')')
Upvotes: 8
Views: 7788
Reputation: 1125148
You need to figure out what encoding tree.com used; according to this post, that could be any of the MS-DOS codepages.
You could go through each of the MS-DOS encodings; most of those have a codec in the Python standard library. I'd try cp437 and cp850 first; the latter is the MS-DOS Western European codepage, roughly the predecessor of cp1252, I think.
Pass the encoding to open():
myfile = open('tree.txt', encoding='cp437')
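To compare candidate codecs side by side, here is a minimal sketch. The helper name decode_attempts, the candidate list, and the sample bytes are made up for illustration; the sample imitates one tree line containing byte 0x81, the byte from the traceback. Note that cp437 assigns a character to every byte, so a decode that merely succeeds proves nothing; you still have to look at the result and check that the box-drawing characters and accented letters come out right.

```python
def decode_attempts(data, candidates=("cp1252", "cp437", "cp850")):
    """Decode raw bytes with each candidate codec.

    Returns a dict mapping codec name to the decoded text,
    or to None when that codec cannot decode the bytes.
    """
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Hypothetical sample: a box-drawing prefix followed by a name with byte 0x81.
sample = b"\xc3\xc4\xc4\xc4M\x81ller"
results = decode_attempts(sample)

for name, text in results.items():
    # ascii() keeps the output printable on any console encoding.
    print(name, "->", "undecodable" if text is None else ascii(text))
```

For a real file, read it in binary mode first (open('tree.txt', 'rb').read()) and pass those bytes to the helper.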
You really should look into using os.walk() instead of tree.com for this task, though; at the very least it'll save you from having to deal with issues like these.
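A rough sketch of the os.walk() approach follows. It assumes album folders are named "YYYY - Title" and sit directly inside an artist folder, which is only inferred from the slicing in the question (input[8:12] for the year, input[15:-1] for the title). The function name find_albums is hypothetical, and it returns plain tuples rather than the question's Album objects, since that class isn't shown.

```python
import os
import re

# Assumed folder naming, inferred from the question's slices: "1993 - Debut".
ALBUM_DIR = re.compile(r"^(\d{4}) - (.+)$")


def find_albums(root):
    """Return (artist, title, year) tuples for album folders under root."""
    albums = []
    for dirpath, dirnames, _filenames in os.walk(root):
        dirnames.sort()  # in-place sort makes the traversal order deterministic
        for name in dirnames:
            match = ALBUM_DIR.match(name)
            if match:
                # The parent folder of an album folder is taken as the artist.
                artist = os.path.basename(dirpath)
                albums.append((artist, match.group(2), int(match.group(1))))
    return albums
```

Something like find_albums("F:\\Music") would then replace both the os.system() call and the line parsing. Folder names arrive as ordinary Python strings, so there is no text file to decode.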
Upvotes: 9
Reputation: 10172
In this line:
myfile = open('tree.txt')
you should specify the encoding of your file. On Windows, try:
myfile = open('tree.txt', encoding='cp1250')
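If you are not sure of the encoding yet and just want the loop to keep going, open() also accepts an errors argument. The sketch below is only an illustration: the helper read_lines and the temporary demo file are made up. With errors='replace', an undecodable byte becomes U+FFFD instead of raising, which hides the problem rather than fixing it, so the right codec is still the better answer.

```python
import os
import tempfile


def read_lines(path, encoding, errors="strict"):
    """Read a text file with an explicit encoding and error policy."""
    with open(path, encoding=encoding, errors=errors) as handle:
        return [line.rstrip("\n") for line in handle]


# Hypothetical demo file holding one line with byte 0x81, as in the traceback.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as handle:
    handle.write(b"M\x81ller\r\n")

lenient = read_lines(demo_path, "cp1252", errors="replace")  # 0x81 becomes U+FFFD
correct = read_lines(demo_path, "cp437")                     # 0x81 decodes to u-umlaut
os.remove(demo_path)
```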
Upvotes: 1