chatbottest

Reputation: 403

subprocess "TypeError: a bytes-like object is required, not 'str'"

I'm using this code from a question asked a few years ago, but I believe it is outdated. When I run it, I get the error above. I'm still a novice in Python, so I couldn't get much clarification from similar questions. Why is this happening?

import subprocess

def getLength(filename):
  result = subprocess.Popen(["ffprobe", filename],
    stdout = subprocess.PIPE, stderr = subprocess.STDOUT)
  return [x for x in result.stdout.readlines() if "Duration" in x]

print(getLength('bell.mp4'))

Traceback

Traceback (most recent call last):
  File "B:\Program Files\ffmpeg\bin\test3.py", line 7, in <module>
    print(getLength('bell.mp4'))
  File "B:\Program Files\ffmpeg\bin\test3.py", line 6, in getLength
    return [x for x in result.stdout.readlines() if "Duration" in x]
  File "B:\Program Files\ffmpeg\bin\test3.py", line 6, in <listcomp>
    return [x for x in result.stdout.readlines() if "Duration" in x]
TypeError: a bytes-like object is required, not 'str'

Upvotes: 30

Views: 54823

Answers (2)

Martijn Pieters

Reputation: 1121764

subprocess returns bytes objects for the stdout and stderr streams by default, so any operation against that data must also use bytes objects. "Duration" in x uses a str object. Use a bytes literal instead (note the b prefix):

return [x for x in result.stdout.readlines() if b"Duration" in x]

or decode your data first, if you know the encoding used (usually, the locale default, but you could set LC_ALL or more specific locale environment variables for the subprocess):

return [x for x in result.stdout.read().decode(encoding).splitlines(True)
        if "Duration" in x]
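To see both fixes side by side without needing ffprobe installed, here is a minimal sketch. It assumes a stand-in child process (a second Python interpreter started through sys.executable) that prints a line containing "Duration", and it assumes that output is UTF-8 compatible; neither assumption comes from the original question.

```python
import subprocess
import sys

# Stand-in for ffprobe (an assumption for this sketch): a child Python
# process that prints two lines, one of which mentions "Duration".
cmd = [
    sys.executable,
    "-c",
    "print('Input #0'); print('  Duration: 00:00:05.00')",
]

proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
raw, _ = proc.communicate()  # raw is a bytes object, not str

# Option 1: compare bytes with bytes by using a bytes literal.
bytes_matches = [line for line in raw.splitlines(True) if b"Duration" in line]

# Option 2: decode to str first, then compare str with str.
# "utf-8" is assumed here; the real encoding depends on the program and locale.
text = raw.decode("utf-8")
text_matches = [line for line in text.splitlines(True) if "Duration" in line]

print(bytes_matches)
print(text_matches)
```

Either list comprehension avoids the TypeError because both operands of `in` have the same type.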

The alternative is to tell subprocess.Popen() to decode the data to Unicode strings by setting the encoding argument to a suitable codec:

result = subprocess.Popen(
    ["ffprobe", filename],
    stdout=subprocess.PIPE, stderr = subprocess.STDOUT,
    encoding='utf8'
)

If you set text=True (Python 3.7 and up; in earlier versions the argument is called universal_newlines), you also enable decoding, using your system default codec, the same one that is used for open() calls. In this mode, the pipes are line buffered by default.
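As a rough illustration of text mode, the sketch below uses subprocess.run() with text=True (so it needs Python 3.7 or later). The helper name get_duration_lines and the stand-in command are hypothetical and only serve the example; with ffprobe installed you would pass ["ffprobe", filename] instead.

```python
import subprocess
import sys


def get_duration_lines(cmd):
    """Run cmd and return the output lines that contain 'Duration'.

    Hypothetical helper for illustration; text=True makes stdout a str,
    decoded with the system default codec.
    """
    completed = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    # completed.stdout is already a str, so a plain str literal works here.
    return [line for line in completed.stdout.splitlines() if "Duration" in line]


# Stand-in for ffprobe (an assumption): a child Python process that
# prints a single line mentioning "Duration".
demo_cmd = [sys.executable, "-c", "print('Duration: 00:00:05.00')"]
print(get_duration_lines(demo_cmd))
```

Because the decoding happens inside subprocess, the filtering code no longer has to know about bytes at all.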

Upvotes: 62

Harshith Thota

Reputation: 864

As the error says, "Duration" is a string, whereas x is a bytes-like object, because result.stdout.readlines() returns the lines of the output as bytes and not as strings.

Hence store "Duration" in a variable, say str_var, and encode it into a bytes object using str_var.encode('utf-8').
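A small sketch of that approach, using hard-coded sample lines in place of real ffprobe output (the sample data is an assumption made for illustration):

```python
# The search term as a str, then encoded once into a bytes object.
str_var = "Duration"
needle = str_var.encode("utf-8")

# Sample lines shaped like what readlines() returns on a binary pipe
# (made-up data, not real ffprobe output).
lines = [b"Input #0\n", b"  Duration: 00:00:05.00\n"]

# Both operands of `in` are now bytes, so no TypeError is raised.
matches = [line for line in lines if needle in line]
print(matches)
```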

For details, see the related question "Best way to convert string to bytes in Python 3?".

Upvotes: 5
