iAdjunct

Reputation: 2989

What does bytes.decode() with errors='replace' do?

In the documentation at https://docs.python.org/3/library/stdtypes.html#bytes.decode

It says that errors='replace' is a valid option, but what does it replace the invalid values with?

Upvotes: 2

Views: 3495

Answers (1)

cs95

Reputation: 402783

Follow the documentation to Error Handlers and it will explain that "replace" is applicable to text encodings.

Value: 'replace'
Meaning: Replace with a suitable replacement marker; Python will use the official U+FFFD REPLACEMENT CHARACTER for the built-in codecs on decoding, and ‘?’ on encoding.

U+FFFD acts as a filler for bytes that cannot be decoded. It looks like this:

b'ab\xffcd'.decode('utf-8', 'replace')
# 'ab�cd'

Without the "replace" argument, the default handler (errors='strict') raises a UnicodeDecodeError on the same input:

b'ab\xffcd'.decode('utf-8')
# UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 2: invalid start byte

Upvotes: 4
