Reputation: 1824
I am getting UnicodeDecodeError
with the traceback.print_exc(file=sys.stdout)
. I am using Python3.4 and did not get the problem with Python2.7.
Am I missing something here? How can I make sure that sys.stdout
passes the correct encoded/decoded to the traceback.print_exc()
?
My code looks something similar to this:
try:
# do something which might throw an exception
except Exception as e:
# do something
traceback.print_exc(file=sys.stdout) # Here I am getting the error
Error log:
traceback.print_exc(file=sys.stdout)
File "C:\Python34\lib\traceback.py", line 252, in print_exc
print_exception(*sys.exc_info(), limit=limit, file=file, chain=chain)
File "C:\Python34\lib\traceback.py", line 169, in print_exception
for line in _format_exception_iter(etype, value, tb, limit, chain):
File "C:\Python34\lib\traceback.py", line 153, in _format_exception_iter
yield from _format_list_iter(_extract_tb_iter(tb, limit=limit))
File "C:\Python34\lib\traceback.py", line 18, in _format_list_iter
for filename, lineno, name, line in extracted_list:
File "C:\Python34\lib\traceback.py", line 65, in _extract_tb_or_stack_iter
line = linecache.getline(filename, lineno, f.f_globals)
File "C:\Python34\lib\linecache.py", line 15, in getline
lines = getlines(filename, module_globals)
File "C:\Python34\lib\linecache.py", line 41, in getlines
return updatecache(filename, module_globals)
File "C:\Python34\lib\linecache.py", line 127, in updatecache
lines = fp.readlines()
File "C:\Python34\lib\codecs.py", line 313, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 5213: invalid continuation byte
Upvotes: 0
Views: 1375
Reputation: 1121764
The traceback module wants to include source code lines with the traceback. Normally, a traceback consists only of pointers to source code, not the source code itself, as Python has been executing the compiled bytecode. In the bytecode are hints as to exactly what source code line the bytecode came from.
To then show the sourcecode, the actual source is read from disk, using the linecache
module. This also means that Python has to determine the encoding for those files too. The default encoding for a Python 3 source file is UTF-8, but you can set a PEP 263 comment to let Python know if you are deviating from that.
Because source code is read after the code is already loaded and a traceback took place, it is possible that you changed the source code after starting the script, or there was a byte cache file (in a __pycache__
subdirectory) that appeared to be fresh but was no longer matching your source files.
Either way, when you started the script, Python was able to re-use a bytecache file or read the source code just fine and run your code. But when the traceback was being printed, at least one of the source code files was no longer decodable as UTF-8.
If you can reliably reproduce the traceback (so start the Python script again without encoding problems but printing the traceback fails), it is most likely a stale bytecode file somewhere, one that could even hold pointers to a filename that now contains nothing but binary data, not plain source.
If you know how to use the pdb
module, add a pdb.set_trace()
call before the traceback.print_exc()
call and trace what filenames are being loaded from by the linecache
module.
Otherwise edit your C:\Python34\lib\traceback.py
file and insert a print('Loading {} from the linecache'.format(filename))
statement just before the linecache.checkcache(filename)
line in the _extract_tb_or_stack_iter
function.
Upvotes: 3