user2181913

Reputation: 79

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 434852: invalid continuation byte

I am using hfcca to calculate the cyclomatic complexity of some C++ code. hfcca is a simple Python script (https://code.google.com/p/headerfile-free-cyclomatic-complexity-analyzer/). When I run the script to generate the output as an XML file, I get the following error:

Traceback (most recent call last):
    File "./hfcca.py", line 802, in <module>
    main(sys.argv[1:])
    File "./hfcca.py", line 798, in main
    print(xml_output([f for f in r], options))
    File "./hfcca.py", line 798, in <listcomp>
    print(xml_output([f for f in r], options))
    File "/x/home06/smanchukonda/PREFIX/lib/python3.3/multiprocessing/pool.py", line 652, in next
    raise value
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 434852: invalid continuation byte

Please help me with this.

Upvotes: 7

Views: 43869

Answers (2)

Biashara Employers

Reputation: 51

I had the same problem when rendering Markup("""yyyyyy"""), and I solved it using an online tool which removed the 'bad' characters: https://pteo.paranoiaworks.mobi/diacriticsremover/

It is a nice tool and works even offline.

Upvotes: 2

monk

Reputation: 981

It looks like the file contains characters encoded as latin1, which produce byte sequences that are not valid utf8. The file utility can help you figure out which encoding a file should be treated as, e.g.:

monk@monk-VirtualBox:~$ file foo.txt 
foo.txt: UTF-8 Unicode text

Here's what the byte 0xe2 from the error message means in latin1:

>>> b'\xe2'.decode('latin1')
'â'
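To check which bytes actually trip the decoder, it can help to look at the data around the reported position. Below is a minimal sketch, not part of hfcca; the helper names `first_bad_offset` and `bytes_around` and the sample bytes are made up for illustration. It relies on the `start` attribute of `UnicodeDecodeError`, which holds the offset of the first undecodable byte.

```python
def first_bad_offset(data, encoding="utf-8"):
    """Return the offset of the first undecodable byte, or None if data decodes cleanly."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte the codec rejected.
        return exc.start
    return None


def bytes_around(data, position, width=10):
    """Return the slice of data surrounding position (hypothetical helper)."""
    start = max(0, position - width)
    return data[start:position + width]


# Illustrative sample only: 'â' stored as the single latin1 byte 0xe2,
# followed by plain ASCII, which is not a valid utf8 sequence.
sample = b"int caf\xe2 = 1;"
offset = first_bad_offset(sample)
print(offset, bytes_around(sample, offset, width=4))
```

For a real file you would read it in binary mode (`open(path, "rb").read()`) and pass the result to `first_bad_offset`. Seeing the neighbouring bytes usually makes it easier to guess the original encoding.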

The easiest fix is probably to convert the files to utf8.
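One way to do the conversion in Python is sketched below. It assumes the source files really are latin1, which you should confirm first (for example with the file utility). The function name `convert_to_utf8` and its parameters are hypothetical, not something hfcca provides.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="latin-1"):
    """Re-encode src from source_encoding (assumed latin1) to utf8 and write it to dst."""
    # Read and write raw bytes so that line endings are left untouched.
    raw = Path(src).read_bytes()
    text = raw.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))
```

Calling `convert_to_utf8("foo.cpp", "foo_utf8.cpp")` would write a utf8 copy next to the original (the file names here are placeholders). Note that latin1 maps every byte to some character, so this never raises a decode error; if the guess about the source encoding is wrong, the output will be valid utf8 but may contain the wrong characters.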

Upvotes: 11
