David

Reputation: 139

os.popen().read() - charmap decoding error

I have already read UnicodeDecodeError: 'charmap' codec can't decode byte X in position Y: character maps to <undefined>. While the error message is similar, the code is completely different, because I use os.popen in this question, not open. I cannot use the answers from the other questions to solve this problem.

output = os.popen("dir").read()

This line, which is supposed to assign the output of command "dir" to variable "output", is causing this error:

'charmap' codec can't decode byte 0x88 in position 260: character maps to <undefined>

I think this might be happening because some files in the folder have letters such as ł, ą, ę and ć in their names. I have no idea how to fix this, though.

Upvotes: 6

Views: 9527

Answers (4)

ERROR

Reputation: 11

After some time exploring, I found this solution:

import os
stream = os.popen("dir")
stream._stream.reconfigure(encoding='latin', newline="") # Now the stream is configured in the encoding 'latin'
data = stream.read()

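Here we use the private _stream attribute of the object returned by os.popen to reconfigure the underlying text stream before reading. Latin-1 maps every byte to a character, so decoding can no longer fail, although non-ASCII file names may come out as the wrong characters. While this may feel hacky, it is the only solution I have found.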

If you have found a better solution, please edit this answer!

Upvotes: 0

If, like me, you used a with-statement together with readline() in Python 2 (for a time zone utility on Windows), the same code won't work in Python 3:

with os.popen("tzutil /l") as source:
    key, value = self.get_key_value(source, True)
    while value and key:
        timezones_to_json.append({u"key": key, u"value": value, u"toolTip": key})
        key, value = self.get_key_value(source, False)
return timezones_to_json

def get_key_value(self, source, first=False):
    if not first:
        source.readline()
    value = source.readline().strip()
    key = source.readline().strip()
    return key, value

So my changes to python3 were:

  1. As @Josh Lee said, I used subprocess.Popen instead, but then I got an AttributeError: __exit__.

  2. So I had to add .stdout at the end, so that the object in the with-statement has __enter__ and __exit__ methods:

    with subprocess.Popen(['tzutil', '/l'], stdout=subprocess.PIPE).stdout as source:
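For what it's worth, on Python 3.2 and later subprocess.Popen is itself a context manager, so a sketch along these lines may avoid putting .stdout in the with-statement. The child command below is a stand-in (the Python interpreter printing two lines) so that it runs on any OS; on Windows you would substitute ['tzutil', '/l']. The encoding value is an assumption and should match whatever the child actually writes.

```python
import subprocess
import sys

# Stand-in child process so the sketch runs anywhere;
# on Windows this would be ['tzutil', '/l'].
cmd = [sys.executable, "-c", "print('first'); print('second')"]

# Popen supports the with-statement since Python 3.2; leaving the block
# closes the pipe and waits for the child to exit.
# encoding= (Python 3.6+) makes proc.stdout yield str instead of bytes.
with subprocess.Popen(cmd, stdout=subprocess.PIPE, encoding="utf-8") as proc:
    lines = [line.strip() for line in proc.stdout]

print(lines)
```

Iterating over proc.stdout reads one line at a time, which plays the same role as the repeated readline() calls above.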
    

Upvotes: 0

user202729

Reputation: 3955

In this case, using subprocess.Popen is too general, too verbose and too hard to remember. Use subprocess.check_output instead.

It returns a bytes object, which can be converted to str with its decode method.

import subprocess
x = subprocess.check_output(['ls','/'])
print(x.decode('utf-8'))
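If the output may contain bytes that are not valid in the chosen encoding, check_output (Python 3.6+) also accepts encoding and errors arguments, so decoding can happen in the same call. A minimal sketch, using the Python interpreter as a stand-in command so it runs on any OS; 'utf-8' here is an assumption, and on Windows the right value would be whatever code page the command writes in:

```python
import subprocess
import sys

# Stand-in command so the sketch runs on any OS.
cmd = [sys.executable, "-c", "print('hello')"]

# encoding= makes check_output return str instead of bytes;
# errors='replace' substitutes U+FFFD for undecodable bytes instead of raising.
text = subprocess.check_output(cmd, encoding="utf-8", errors="replace")
print(text.strip())
```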


Upvotes: 3

Josh Lee

Reputation: 177594

os.popen is just a wrapper around subprocess.Popen combined with an io.TextIOWrapper object:

The returned file object reads or writes text strings rather than bytes.

If Python's default encoding doesn't work for you, you should use subprocess.Popen directly.
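As a rough sketch of what that means in practice, the same two pieces can be assembled by hand with an explicit encoding. The helper name popen_read is made up for illustration, and the encoding you pass is an assumption about what the command writes (for example, possibly 'cp852' on a Polish-locale console):

```python
import io
import subprocess

def popen_read(cmd, encoding):
    """Roughly what os.popen(cmd).read() does, but with a chosen encoding.

    popen_read is a hypothetical helper, not part of the standard library.
    """
    # Run the command through the shell and capture its stdout as raw bytes.
    proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    # Wrap the byte pipe in a text layer that decodes with the given encoding.
    with io.TextIOWrapper(proc.stdout, encoding=encoding) as stream:
        text = stream.read()
    proc.wait()
    return text

print(popen_read("echo hello", "utf-8").strip())
```

With this, something like popen_read("dir", "cp852") would replace os.popen("dir").read(), provided the code page guess is right for the machine.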

The underlying issue is that cmd writes its output in a legacy code page rather than Unicode by default, even when the output goes to a pipe. This behavior may depend on your Windows version.
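To illustrate with the byte from the error message: the sketch below assumes a Polish-locale Windows, where the console (OEM) code page would be cp852 and the ANSI code page that Python picks for pipes would be cp1250. Both code page names are assumptions about the asker's system, not something stated in the question. Under that assumption, 0x88 is the letter ł in cp852 but has no mapping in cp1250, which would produce exactly this kind of charmap error.

```python
raw = b"\x88"  # the byte reported in the error message

# Assumed console (OEM) code page: 0x88 decodes to a letter here.
letter = raw.decode("cp852")

# Assumed ANSI code page: 0x88 is undefined, so decoding raises.
try:
    raw.decode("cp1250")
except UnicodeDecodeError as exc:
    error = str(exc)
    print(error)
```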

You can fix this by passing the /U flag to cmd, which makes it write UTF-16 to pipes:

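import subprocess
# /u switches cmd's piped output to UTF-16; /c runs the command and then exits
p = subprocess.Popen('cmd /u /c dir', stdout=subprocess.PIPE)
result = p.communicate()  # tuple of (stdout bytes, stderr)
text = result[0].decode('u16')  # 'u16' is an alias for UTF-16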

Upvotes: 6
