Reputation: 55
I am trying to open some configuration files with the following command:
f=open(os.path.join(root, name),mode='rt',errors='ignore')
However, after upgrading to Python 3.5 I get the following error:
    for line in f:
  File "C:\python35-32\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 57: character maps to <undefined>
This code worked fine under Python 2.7. I have tried specifying the encoding as utf8 or latin1, but neither works now. Can anyone suggest a way forward?
It would be fine to ignore the error and move on to the next line. How can I skip the offending part?
Upvotes: 0
Views: 1026
Reputation: 566
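You can use codecs.open:
import codecs
import os
# codecs.open appends 'b' to the mode itself when an encoding is given,
# so pass 'r' here; 'rt' would become 'rtb' and raise ValueError.
f = codecs.open(os.path.join(root, name), mode='r', encoding='utf-8')
for line in f:
    pass  # do something with line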
Also, I don't think the problem is with your code but rather with the Windows command prompt, whose encoding is 'cp1252'. I ran into this issue a long time ago. Basically, if you run your script in the Windows command prompt, the program crashes as soon as it calls print on Unicode data, because that text cannot be encoded for the console.
You can also get around this problem by printing the raw representation of the data, that is, by changing every print call to print("%r" % line).
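One caveat with the %r approach: in Python 3, repr() leaves printable non-ASCII characters as they are, so the console can still choke on them. A minimal sketch of an alternative, assuming the goal is only to get something printable on any console, is to use ascii() (or the equivalent "%a" format), which escapes everything outside ASCII. The helper name to_console_safe and the sample string are made up for illustration.

```python
def to_console_safe(line):
    """Return a pure-ASCII representation of line (hypothetical helper)."""
    # ascii() behaves like repr() but escapes every non-ASCII character.
    return ascii(line)


# Sample text with a Latin-1 character and a snowman (illustrative only).
sample = "caf\u00e9 \u2603\n"
safe = to_console_safe(sample)

# "%a" % sample produces the same string as ascii(sample).
print(safe)
```

The output is less readable than the real text, but it contains only ASCII characters, so a cp1252 console can encode it.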
Upvotes: 0
Reputation: 7787
Try specifying the file's encoding: open(os.path.join(root, name), encoding='utf-8')
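If the files might contain bytes that are not valid in the chosen encoding, you can combine encoding with the errors argument of open(). Here is a rough sketch, assuming UTF-8 is the right encoding for most of the content. The read_lines helper, the temporary file, and its contents are invented for the demo. errors='replace' substitutes U+FFFD for each bad byte, while errors='ignore' drops it silently.

```python
import os
import tempfile


def read_lines(path, encoding="utf-8", errors="replace"):
    """Yield decoded lines, applying the given error handler to bad bytes."""
    with open(path, mode="rt", encoding=encoding, errors=errors) as f:
        for line in f:
            yield line


# Build a small demo file containing the stray byte 0x90 (made-up content).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as tmp:
    tmp.write(b"key=va\x90lue\nnext=ok\n")

replaced = list(read_lines(path))                   # bad byte becomes U+FFFD
ignored = list(read_lines(path, errors="ignore"))   # bad byte is dropped
os.remove(path)

print(replaced)
print(ignored)
```

With 'replace' you can still see where data was lost, which is usually safer for configuration files than dropping bytes without a trace.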
Upvotes: 1