shafi97

Reputation: 21

error UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbe in position 2: invalid start byte

I have a piece of code that does this:

import subprocess

def command(self, s, level=1):
    # universal_newlines=True makes communicate() decode stdout/stderr to str
    sub = subprocess.Popen(s, bufsize=0, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
    (out, err) = sub.communicate()

Calling the communicate method raises this error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xbe in position 2: invalid start byte

The Popen object is set up to read the output as strings. When it works, communicate should return a tuple (stdoutdata, stderrdata).

Upvotes: 2

Views: 3726

Answers (1)

lenz

Reputation: 5817

With the universal_newlines=True parameter (which has a more readable alias, text=True, since Python 3.7), input and output are encoded and decoded implicitly by Python. You can tell Python which codec to use through the encoding= parameter. If you don't specify a codec, the same defaults are used as in io.TextIOWrapper.
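To see which codec that default resolves to on a given machine, something like the following should work. It is only a sketch: it wraps an empty in-memory stream (not a real pipe) in io.TextIOWrapper without naming a codec, and the name default_codec is just for illustration.

```python
import io

# Wrap an in-memory byte stream without naming a codec, so the wrapper
# falls back to the same default that text-mode pipes would use.
wrapper = io.TextIOWrapper(io.BytesIO())
default_codec = wrapper.encoding

# The printed name varies with OS, locale and Python version.
print(default_codec)
```

If this prints a UTF-8 variant, any output from the subprocess that is not valid UTF-8 will fail to decode unless encoding= is given explicitly.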

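The default codec depends on a number of factors (OS, locale, Python version), but in your case it is apparently UTF-8. However, your subprocess returns data which is not UTF-8 encoded: the byte 0xbe is a continuation byte in UTF-8 and can never start a character, which is why the decoder reports an invalid start byte. So you need to refer to the documentation of that command: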

  • Does it return text in a Windows codepage, e.g. CP-1252? Then specify this in the encoding= parameter to the subprocess.Popen call.
  • Does it return text at all? If not, omit the universal_newlines parameter and process the binary data returned as bytes objects.
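Both options can be sketched roughly as follows. The child command here is a stand-in, not the asker's actual command: it simply writes the byte 0xbe at position 2, and CP-1252 is assumed as the encoding purely for illustration. The helper names run_text and run_bytes are hypothetical.

```python
import subprocess
import sys

# Stand-in child process (an assumption for this example): it writes
# bytes that are not valid UTF-8, with 0xbe at position 2.
CHILD = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(b'1 \\xbe cup')"]

def run_text(cmd, encoding):
    """Option 1: let Python decode the output with an explicit codec."""
    sub = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE, encoding=encoding)
    return sub.communicate()  # (str, str)

def run_bytes(cmd):
    """Option 2: no text mode, so the output stays as bytes objects."""
    sub = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE)
    return sub.communicate()  # (bytes, bytes)

text_out, text_err = run_text(CHILD, "cp1252")
raw_out, raw_err = run_bytes(CHILD)

print(repr(text_out))
print(repr(raw_out))
```

Passing encoding= switches the pipes to text mode by itself, so universal_newlines=True is not needed alongside it. Which codec is correct depends entirely on what the real command emits.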

Upvotes: 4
