Ragini Dahihande

Reputation: 684

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position XXX: character maps to <undefined>

I am trying to read a log file from a Python script. The program works fine on Linux, but on Windows it fails. After reading some lines, it raises the following error at a particular line:

  File "C:\Python\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 311: character maps to <undefined>

This is the code I am using to read the file:

with open(log_file, 'r') as log_file_fh:
    for line in log_file_fh:
        print(line)

I have tried to fix it by using different encodings such as ascii, utf8, utf-8, ISO-8859-1, cp1252, and cp850, but I still face the same issue. Is there any way to fix this?

Upvotes: 9

Views: 33986

Answers (2)

Ragini Dahihande

Reputation: 684

The log file I want to read from the Python script uses a Western European encoding. Following the list at https://docs.python.org/2.4/lib/standard-encodings.html, I used 'cp850' as the encoding, and this worked for me:

with open(log_file, 'r', encoding='cp850') as log_file_fh:
    for line in log_file_fh:
        print(line)
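
To illustrate why the choice of codec matters, here is a minimal sketch that decodes the single byte 0x8f from the traceback with a few of the codecs mentioned in the question. The one-byte sample is an assumption for illustration only, since the real log content is not shown in the post.

```python
# Decode the byte from the traceback (0x8f) with several codecs.
sample = b"\x8f"  # hypothetical one-byte sample, not taken from the real log

results = {}
for codec in ("cp1252", "cp850", "latin-1"):
    try:
        results[codec] = sample.decode(codec)
    except UnicodeDecodeError as exc:
        # Record the reason instead of stopping at the first failure.
        results[codec] = "error: " + exc.reason

for codec, outcome in results.items():
    print(codec, repr(outcome))
```

cp1252 leaves 0x8f unassigned, which produces the error from the question. cp850 and latin-1 assign a character to every byte, so they never raise, but they can silently produce the wrong text if the file was not actually written in that encoding.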

However, that page lists many codecs for Western Europe, and I don't think this is the right solution, because most developers advise against using 'cp850'.

The best way to handle the decoding error, for my purposes, is to pass the errors argument when opening the file and set it to 'ignore'. Any byte that cannot be decoded is then silently dropped. In my case this is acceptable because I don't need the entire content of the file, only some specific log entries.

with open(log_file, 'r', errors='ignore') as log_file_fh:
    for line in log_file_fh:
        print(line)
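
For comparison, here is a small sketch of how 'ignore' differs from two other standard error handlers, 'replace' and 'backslashreplace'. The sample bytes, the temporary file, and the helper name read_text are all assumptions made for this example. The encoding is set to cp1252 explicitly so the result does not depend on the platform default.

```python
import os
import tempfile

# Hypothetical log content containing the undecodable byte 0x8f.
raw = b"disk \x8f full\n"

# Write the bytes to a temporary file that stands in for the real log file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(raw)
    path = tmp.name


def read_text(path, errors):
    # Explicit encoding so the behaviour is the same on Linux and Windows.
    with open(path, "r", encoding="cp1252", errors=errors) as fh:
        return fh.read()


ignored = read_text(path, "ignore")             # bad byte is dropped
replaced = read_text(path, "replace")           # bad byte becomes U+FFFD
escaped = read_text(path, "backslashreplace")   # bad byte becomes the text \x8f
os.remove(path)

print(repr(ignored))
print(repr(replaced))
print(repr(escaped))
```

'ignore' loses the information that a byte was there at all, while 'replace' and 'backslashreplace' leave a visible marker, which may be easier to debug later.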

Upvotes: 15

Roy Holzem

Reputation: 870

EDIT: open your file in binary mode as suggested (with open(log_file, 'rb')), then decode each line as UTF-8 in your code:

# Binary mode yields bytes, so each line must be decoded explicitly.
with open(log_file, 'rb') as log_file_fh:
    for line in log_file_fh:
        line = line.decode('utf-8')
        print(line)
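
If the file is not valid UTF-8 throughout, decoding in binary mode can still fail on the same line. One possible sketch is to try UTF-8 first and fall back to another codec with errors='replace'. The helper name decode_line, the cp1252 fallback, and the sample bytes are assumptions for illustration, and io.BytesIO stands in for open(log_file, 'rb').

```python
import io


def decode_line(raw_line, primary="utf-8", fallback="cp1252"):
    """Decode with the primary codec; on failure use the fallback codec,
    replacing any byte that the fallback cannot decode either."""
    try:
        return raw_line.decode(primary)
    except UnicodeDecodeError:
        return raw_line.decode(fallback, errors="replace")


# Hypothetical log bytes: an ASCII line, a UTF-8 line, and a line with 0x8f.
fake_log = io.BytesIO(
    b"ok line\n" + "caf\u00e9\n".encode("utf-8") + b"bad \x8f byte\n"
)

# Iterating a binary file object yields one bytes object per line.
lines = [decode_line(raw) for raw in fake_log]

for line in lines:
    print(repr(line))
```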

Upvotes: -2
