EZC404

Reputation: 25

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 106: character maps to <undefined>

I'm trying to merge multiple .ts files into one. When I run the code below, it raises:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 106: character maps to <undefined>

count = 76
main = "tsPath/seg_768x432_725648_"

for x in range(count):
    z = main+str(x)+".ts"
    print(z)
    with open(output, 'wb') as f:
        with open(z) as fh:
            f.write(fh.read())

Upvotes: 1

Views: 4481

Answers (3)

user13372696

Reputation:

Okay, I see the problem now.

Solution

Assuming that the files exist on your computer and that output holds the path of the merged file, the following code should work:

count = 76
main = "tsPath/seg_768x432_725648_"

with open(output, 'wb') as f:
    for x in range(count):
        z = main+str(x)+".ts"
        print(z)
        with open(z, 'rb') as fh:
            f.write(fh.read())

What changes were made and why?

  1. Notice that the loop is now inside the with block of open(output, 'wb'), not the other way round. Every time you open a file in wb mode (or w mode), any data already in the file is erased. Opening the output file for writing only once lets the data from each file you read accumulate, which is what you want.

  2. The mode parameter of the open function defaults to 'r' (read text). You are reading binary data, so use 'rb' mode. (Other answers got this point right but missed point #1.)
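To illustrate point 1, here is a minimal sketch that compares the two loop layouts on throwaway files. The helper names merge_reopening and merge_once, the temporary directory, and the tiny byte payloads are all hypothetical stand-ins for the real .ts segments.

```python
import os
import tempfile

def merge_reopening(paths, output):
    # Pattern from the question: 'wb' truncates output on every iteration.
    for p in paths:
        with open(output, "wb") as f, open(p, "rb") as fh:
            f.write(fh.read())

def merge_once(paths, output):
    # Pattern from this answer: open output once, then write each segment in order.
    with open(output, "wb") as f:
        for p in paths:
            with open(p, "rb") as fh:
                f.write(fh.read())

# Create three small fake segments (hypothetical data, not real video).
tmp = tempfile.mkdtemp()
paths = []
for i, payload in enumerate([b"\x47\x8f", b"\x00\x01", b"\xff"]):
    p = os.path.join(tmp, f"seg_{i}.ts")
    with open(p, "wb") as fh:
        fh.write(payload)
    paths.append(p)

out = os.path.join(tmp, "merged.ts")

merge_reopening(paths, out)
with open(out, "rb") as fh:
    reopened = fh.read()  # only the last segment survives

merge_once(paths, out)
with open(out, "rb") as fh:
    merged = fh.read()  # all segments, concatenated in order

print(len(reopened), len(merged))
```

With the first layout the output ends up holding only the last segment, while the second keeps every segment in order.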

Upvotes: 2

n.qber

Reputation: 384

You opened the output file in "wb" mode, but the second file was opened without a mode, so it defaulted to "r" (text mode). The UnicodeDecodeError occurs because the decoder can't convert some of the bytes to a string in that codec. To fix it, change

with open(z) as fh:

to

with open(z, "rb") as fh:

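If you would rather let the shutil module do the copying, note that calling shutil.copy(z, output) inside the loop would overwrite output on every iteration and leave only the last segment. To merge, open output once and append each segment with shutil.copyfileobj:

import shutil

count = 76
main = "tsPath/seg_768x432_725648_"

with open(output, "wb") as f:
    for x in range(count):
        z = main+str(x)+".ts"
        print(z)
        with open(z, "rb") as fh:
            # Appends the segment's bytes to the already open output file.
            shutil.copyfileobj(fh, f)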

Upvotes: 2

Thierry Lathuille

Reputation: 24232

You are opening z in text mode (the default), so Python tries to decode it with an encoding in which 0x8f doesn't correspond to any character (probably CP1252 if you're on Windows).

Since you are manipulating binary data and already open output in binary mode, open z in binary mode as well:

with open(z, 'rb') as fh:
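As a quick check of the explanation above, this small sketch decodes a byte string containing 0x8f with Python's cp1252 codec. It assumes cp1252 is the encoding your system picked, and the two-byte sample is hypothetical; only 0x8f comes from the error message.

```python
# Hypothetical sample: 0x47 decodes fine, 0x8f has no mapping in cp1252.
data = b"\x47\x8f"

try:
    data.decode("cp1252")
    decode_error = None
except UnicodeDecodeError as exc:
    # Keep a reference so the error can be inspected after the except block.
    decode_error = exc

print(decode_error)
```

Reading in 'rb' mode avoids this step entirely, because the bytes are never decoded.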

Upvotes: 2
