Li Cooper

Reputation: 71

How to get the position where UnicodeDecodeError occurred?

How can I get the position where a UnicodeDecodeError occurred? I found material over here and tried to implement it below, but I just get the error NameError: name 'err' is not defined

I have already searched the internet and StackOverflow, but cannot find any hint on how to use it. The Python docs say that this exception has a start attribute, so it must be possible.

Thank you.

    data = buffer + data
    try:
        data = data.decode("utf-8")
    except UnicodeDecodeError:
        # Identify where the error occurred.
        # Chunk that piece off: copy the troubled piece into the buffer and
        # decode the good part, then go back, receive the next chunk of
        # data and concatenate it to the buffer.

        buffer = err.data[err.start:]
        data = data[0:err.start]
        data = data.decode("utf-8")

Upvotes: 2

Views: 1635

Answers (2)

zondo

Reputation: 20346

That information is stored in the exception itself. You can get the exception object with the as keyword, and use the start attribute:

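    while True:
        try:
            data = data.decode("utf-8")
        except UnicodeDecodeError as e:
            # Cut out the undecodable bytes (e.start up to e.end) and retry.
            data = data[:e.start] + data[e.end:]
        else:
            # Decoding succeeded; data is now a str.
            break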
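If the goal is what the question describes (keep the undecodable tail in a buffer instead of discarding it), something along these lines might work. This is only a sketch: `split_decodable` is a made-up helper name, and it assumes the only bad bytes are an incomplete character at the end of a chunk. Note that the exception exposes the original bytes as `object`, not `data`.

```python
def split_decodable(data: bytes):
    """Return (decoded_text, leftover_bytes) for a chunk that may end mid-character.

    Hypothetical helper: assumes everything before the first bad byte is valid UTF-8.
    """
    try:
        return data.decode("utf-8"), b""
    except UnicodeDecodeError as err:
        # err.object is the bytes that were being decoded;
        # err.start is the index of the first byte that could not be decoded.
        good = err.object[:err.start]
        leftover = err.object[err.start:]
        return good.decode("utf-8"), leftover


# "h\u00e9llo" encodes to b'h\xc3\xa9llo'; the first chunk ends mid-character.
text, buffer = split_decodable(b"h\xc3")
more_text, buffer = split_decodable(buffer + b"\xa9llo")
```

If a byte in the middle of the stream is genuinely invalid rather than just incomplete, the leftover will keep growing, so real code would need to decide when to give up on it.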

Upvotes: 5

shiva

Reputation: 2699

In case you just want to ignore the error and decode the rest, you can do:

data = data.decode("utf-8", errors='ignore')
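For illustration, here is roughly how `errors='ignore'` compares with `errors='replace'`, plus one caveat when the data arrives in chunks. The sample bytes are made up for the example.

```python
raw = b"ab\xffcd"  # \xff never appears in valid UTF-8

ignored = raw.decode("utf-8", errors="ignore")    # the bad byte is dropped
replaced = raw.decode("utf-8", errors="replace")  # the bad byte becomes U+FFFD

# Caveat for chunked input: a character split across two chunks is lost,
# because each half is invalid on its own and gets ignored.
encoded = "h\u00e9llo".encode("utf-8")  # b'h\xc3\xa9llo'
first, second = encoded[:2], encoded[2:]
joined = (first.decode("utf-8", errors="ignore")
          + second.decode("utf-8", errors="ignore"))
```

So if the chunks come from a socket or file read, ignoring errors chunk by chunk can silently drop valid characters that straddle a boundary.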

Upvotes: 1
