user3682248

UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 377826: invalid start byte

I get the following error when running the code snippet below. It is raised exactly at the line if uID in repo.git.log():, so the problem is in repo.git.log(). I have looked at the similar questions on Stack Overflow, which suggest using decode("utf-8").

How do I apply decode("utf-8") to the output of repo.git.log()?

UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 377826: invalid start byte 

Relevant code:

..................
uID = gerritInfo['id'].decode("utf-8")
if uID in repo.git.log():
    inwslist.append(gerritpatch)
.....................


Traceback (most recent call last):
  File "/prj/host_script/script.py", line 1417, in <module>
    result=main()
  File "/prj/host_script/script.py", line 1028, in main
    if uID in repo.git.log():
  File "/usr/local/lib/python2.7/dist-packages/git/cmd.py", line 431, in <lambda>
    return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/git/cmd.py", line 802, in _call_process
    return self.execute(make_call(), **_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/git/cmd.py", line 610, in execute
    stdout_value = stdout_value.decode(defenc)
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 377826: invalid start byte

Upvotes: 29

Views: 46065

Answers (4)

Jake

Reputation: 1

The byte 0x92 is not a valid start byte in UTF-8. As Exceen stated in his answer, 0x92 is a smart quote (’) in Windows-1252. To resolve this, either decode the data as Windows-1252 or replace the smart quote with a normal quote.
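
Both remedies can be combined in a few lines. The following is a minimal sketch, assuming the raw bytes really are Windows-1252; the helper name decode_and_normalize and the sample bytes are hypothetical and not part of the original question.

```python
# Map typographic quotes (as decoded from Windows-1252) to plain ASCII quotes.
SMART_QUOTES = {
    "\u2018": "'",  # left single quote, byte 0x91 in Windows-1252
    "\u2019": "'",  # right single quote, byte 0x92 in Windows-1252
    "\u201c": '"',  # left double quote, byte 0x93 in Windows-1252
    "\u201d": '"',  # right double quote, byte 0x94 in Windows-1252
}


def decode_and_normalize(raw, source_encoding="cp1252"):
    """Decode raw bytes (assumed Windows-1252) and replace smart quotes."""
    text = raw.decode(source_encoding)
    for smart, plain in SMART_QUOTES.items():
        text = text.replace(smart, plain)
    return text


# Hypothetical sample containing the offending byte 0x92.
sample = b"it\x92s a \x93test\x94"
print(decode_and_normalize(sample))
```

If the data turns out to be in a different legacy encoding, the source_encoding argument would need to change accordingly.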

Upvotes: 0

Swayam Siddha Panda

Reputation: 41

After some research, I found the solution. In my case, the datadump.json file was causing the issue.

  • Open the file in Notepad
  • Click "Save As"
  • In the Encoding section below, select "UTF-8"
  • Save the file.

Now try running the command again. You should be good to go :)
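
The same re-encoding can be done without Notepad. This is a rough sketch, assuming the file was originally saved as Windows-1252 (what Notepad calls ANSI on many Western systems); the function name reencode_to_utf8 and both paths are hypothetical.

```python
from pathlib import Path


def reencode_to_utf8(src, dst, source_encoding="cp1252"):
    """Read src as source_encoding (an assumption) and write dst as UTF-8."""
    raw = Path(src).read_bytes()
    text = raw.decode(source_encoding)
    # Write bytes directly so line endings are left untouched.
    Path(dst).write_bytes(text.encode("utf-8"))


# Example call with hypothetical paths:
# reencode_to_utf8("datadump.json", "datadump_utf8.json")
```

Writing to a separate destination file keeps the original intact in case the assumed source encoding is wrong.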

Upvotes: 1

Abdul Rehman

Reputation: 5664

Passing encoding='cp1252' when reading the data will solve the issue.
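
The answer does not say where the argument goes. One plausible reading, assuming the data comes from a file, is the encoding parameter of io.open (the same function as the built-in open in Python 3, and also available in Python 2.7). The helper name read_text is hypothetical.

```python
import io


def read_text(path, encoding="cp1252"):
    """Read a text file, decoding it as Windows-1252 by default."""
    with io.open(path, encoding=encoding) as handle:
        return handle.read()


# Example call with a hypothetical path:
# content = read_text("datadump.json")
```

This only helps if the file really is Windows-1252; a file in another encoding would decode without error but could show wrong characters.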

Upvotes: 46

Exceen

Reputation: 765

0x92 is a smart quote (’) in Windows-1252. That byte is not a valid start byte in UTF-8, therefore it can't be decoded as UTF-8.

Maybe your file was edited on a Windows machine, which caused this problem?
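
The behaviour of that single byte can be checked directly. This is a small illustrative sketch using only built-in codecs; the variable names are arbitrary.

```python
raw = b"\x92"

# Decoding the lone byte as UTF-8 fails, as in the traceback above.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc.reason

# Decoding it as Windows-1252 yields the right single quotation mark.
as_cp1252 = raw.decode("cp1252")

# The same character does have a valid (three-byte) UTF-8 encoding.
utf8_bytes = as_cp1252.encode("utf-8")

print(utf8_error)
print(hex(ord(as_cp1252)))
print(utf8_bytes)
```

The character itself is representable in UTF-8; the error is raised because the raw byte 0x92 on its own is not valid UTF-8.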

Upvotes: 23
