carlososm

Reputation: 83

UnicodeDecodeError with 0xc3 in Python subprocess stdout on macOS

I'm developing a Python script to compile LaTeX files on both WSL and macOS. On macOS it fails with a utf-8 codec error while reading the subprocess stdout, but it works in WSL. Both Python versions are 3.6.

The code does not contain any explicit encode/decode call, so I think the problem is in an internal call made by subprocess when it reads stdout.

def execute(cmd, pipe):
    if pipe:
        ps = subprocess.Popen(cmd, stdout=subprocess.PIPE)
        output = subprocess.check_output(pipe, stdin=ps.stdout, universal_newlines=True)
        print(colored(output, 'red'), file=sys.stderr)
        ps.wait()
    else:
        output = subprocess.call(cmd)
        print(colored(output, 'red'), file=sys.stderr)


start_time = time.time()
for cmd, pipe in zip(commands, pipes):
    print(colored(cmd, 'green'), file=sys.stderr)
    execute(cmd, pipe)

The output I get is

['pdflatex', '-shell-escape', '--interaction', 'nonstopmode', '-file-line-error', 'besolidary.tex']
Traceback (most recent call last):
  File "compile.py", line 61, in <module>
    execute(cmd, pipe)
  File "compile.py", line 50, in execute
    output = subprocess.check_output(pipe, stdin=ps.stdout, universal_newlines=True)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 405, in run
    stdout, stderr = process.communicate(input, timeout=timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 830, in communicate
    stdout = self.stdout.read()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 1335: invalid continuation byte

In WSL the same script works fine and runs all the commands.

Upvotes: 4

Views: 1642

Answers (1)

user2722968

Reputation: 16485

Since you specified universal_newlines=True, Python expects text output from the subprocess and decodes it for you. Because no encoding was passed to check_output(), it falls back to the encoding returned by locale.getpreferredencoding(False), which happens to be utf-8 here.
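To compare the two machines, a small diagnostic along these lines can be run on each of them. This is only a sketch: the helper name describe_default_encoding is made up for illustration, and the values it reports depend on the locale settings of the shell that launches Python.

```python
import locale
import sys


def describe_default_encoding():
    """Collect the encodings Python falls back to when none is given."""
    return {
        # Used by subprocess when universal_newlines=True and no encoding is set.
        "preferred": locale.getpreferredencoding(False),
        # Used for file names; shown only for comparison.
        "filesystem": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for name, value in describe_default_encoding().items():
        print(f"{name}: {value}")
```

If both machines report the same preferred encoding, the difference is more likely in what the subprocess emits than in what Python expects.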

As it turns out in your case, the subprocess does not actually encode its output in what Python considers the preferred encoding, so you get a UnicodeDecodeError when Python tries to decode it.
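The same kind of failure can be reproduced without any subprocess. In the sketch below, the byte string and the helper try_decode are made up for illustration; latin-1 is used only as an example of a different encoding, not as a claim about what pdflatex emits on your machine.

```python
def try_decode(data, encoding):
    """Decode bytes, returning the text or a short description of the failure."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"failed: {exc.reason}"


# 0xc3 starts a two-byte UTF-8 sequence, but "(" is not a valid second byte.
sample = b"\xc3("

if __name__ == "__main__":
    print(try_decode(sample, "utf-8"))
    print(try_decode(sample, "latin-1"))
```

A single-byte encoding such as latin-1 never fails to decode, which is why it can hide the problem rather than solve it if it is not the encoding the subprocess really uses.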

If you do expect text output from the subprocess, you need a way to find out which encoding the subprocess uses (or to force it to use a particular one). Otherwise, if the output is in fact binary, leave universal_newlines at its default.
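One way to apply this is to capture raw bytes and decode them yourself. The following is a minimal sketch, not a drop-in replacement for your script: the function name run_pipeline and the default of utf-8 with errors="replace" are assumptions, and the right encoding depends on what your LaTeX toolchain actually writes.

```python
import subprocess


def run_pipeline(cmd, pipe, encoding="utf-8", errors="replace"):
    """Run `cmd | pipe`, capture the output as bytes, and decode it explicitly."""
    ps = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        # No universal_newlines here, so check_output returns bytes.
        raw = subprocess.check_output(pipe, stdin=ps.stdout)
    finally:
        ps.stdout.close()
        ps.wait()
    # Undecodable bytes become U+FFFD instead of raising UnicodeDecodeError.
    return raw.decode(encoding, errors=errors)
```

Since Python 3.6, check_output() also accepts encoding= and errors= keyword arguments, so passing those instead of universal_newlines=True should have a similar effect if you prefer to keep the call on one line.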

Upvotes: 5
