Reputation: 2989
The documentation at https://docs.python.org/3/library/stdtypes.html#bytes.decode says that errors='replace' is a valid option. But what does it replace the invalid values WITH?
Upvotes: 2
Views: 3495
Reputation: 402783
Follow the documentation to Error Handlers and it will explain that "replace" is applicable to text encodings.
Value:
'replace'
Meaning: Replace with a suitable replacement marker; Python will use the official U+FFFD REPLACEMENT CHARACTER for the built-in codecs on decoding, and ‘?’ on encoding.
U+FFFD acts as a filler for bytes that cannot be decoded. It looks like this:
b'ab\xffcd'.decode('utf-8', 'replace')
# 'ab�cd'
Without the "replace" argument, the default errors='strict' applies, and invalid bytes raise a UnicodeDecodeError:
b'ab\xffcd'.decode('utf-8')
# UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 2: invalid start byte
Upvotes: 4