Reputation: 326
How do I turn a string into a bytes object as is, i.e. without encoding it? I can't use .encode() here, because it's corrupting my binary file after saving.
filedata = pathlib.Path('file.bin').read_bytes()
# since I can't modify a bytes object, I should convert it to a string, shouldn't I?
data = ''
for i in filedata:
data += chr(i) if isinstance(i, int) else i
data[3] = '\x01'
data += '\x58\x02\x0C\x80\x61\x39\x56\x18\x55\x61\x89\x42\x42\x16\x46\x17\x54\x70\x10\x58\x60\x10\x10\x01\x75\x10\xF0\xC0\x00\x01\x00\x02\x00\xC0\x00\xD0\x00\x01\x00\xC4\x00\x01\x00\x02\x00\x01\x00\x00\x02\x00\x00\x00'
pathlib.Path('result.bin').write_bytes(data.encode()) # doesn't work as it should
So instead of this:
58 02 0C 80 61 39 56 18 55 61 89 42 42 16 46 17 54 70 10 58 60 10 10 01 75 10 F0 C0 00 01 00 02 00 C0 00 D0 00 01 00 C4 00 01 00 02 00 01 00 00 02 00 00 00
I get this:
58 02 0C C2 80 61 39 56 18 55 61 C2 89 42 42 16 46 17 54 70 10 58 60 10 10 01 75 10 C3 B0 C3 80 00 01 00 02 00 C3 80 00 C3 90 00 01 00 C3 84 00 01 00 02 00 01 00 00 02 00 00 00
I tried modifying the bytes object itself, but I always get this error:
TypeError: 'bytes' object does not support item assignment
Upvotes: 1
Views: 1481
Reputation: 23131
Solved (thanks to John):
import pathlib
import sys

filedata = bytearray(pathlib.Path(sys.argv[1]).read_bytes())
# filedata = bytearray(open(sys.argv[1], 'rb').read()) also works
filedata[1] = 255 # modifying a single byte (0 - 255)
filedata[0:0] = b'\xff' # inserting bytes (filedata[0:1] = ... would replace the first byte instead)
filedata.append(255) # appending one single byte (extend() needs an iterable, not an int)
filedata.extend(filedata2) # appending another array of bytes (bytearray object)
filedata.extend(b'\xff\xff') # appending bytes
filedata.extend([255, 255]) # appending bytes too
pathlib.Path(sys.argv[1]).write_bytes(filedata) # write data to a file
# open(sys.argv[1], 'wb').write(filedata) should work too
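A quick sketch of how these bytearray operations differ, using a made-up four-byte buffer rather than a real file. Slice assignment replaces whatever slice it names, so only an empty slice inserts. append takes a single integer, while extend takes an iterable of integers.

```python
buf = bytearray(b'\x00\x01\x02\x03')   # illustrative data, not read from a file

buf[1] = 255                 # overwrite one byte; length is unchanged
buf[0:1] = b'\xaa'           # replace the first byte (slice of length 1)
buf[0:0] = b'\xbb'           # insert before the first byte (empty slice)

buf.append(0x10)             # append a single byte given as an int
buf.extend(b'\x20\x30')      # append several bytes from a bytes object
buf.extend([0x40, 0x50])     # append several bytes from a list of ints

print(buf.hex(' '))          # bb aa ff 02 03 10 20 30 40 50
```

Passing a bare integer to extend raises a TypeError, because an int is not iterable.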
This was originally added to revision 5 of the question.
Upvotes: 0
Reputation: 4846
How do I turn a string into a bytes object AS IS, i.e. without encoding it?
You can't. That's a contradiction in terms, at least as of Python 3.
A string is a sequence of text characters. Think letters, punctuation, white-space, even control characters. A bytes object is a sequence of 8-bit numbers. How the two sequences are related is a question of encoding. There is no way around it.
Text characters should be thought of as abstract entities. The letter A, for example, simply exists. There is no number associated with it per se. (Internally, it is represented by a Unicode code point, which is a number, but that's an implementation detail.)
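To make the role of the encoding concrete, here is a small sketch. The three-character sample string is made up for illustration. It shows why .encode() with its default of UTF-8 produces the extra C2/C3 bytes seen in the question, and that Latin-1 happens to map code points 0 to 255 one-to-one onto byte values. That is still an encoding, just one that looks like "as is".

```python
# Sample text: three code points, two of them above 0x7F (illustrative only).
text = '\x0c\x80\xf0'

# UTF-8 (the default for str.encode) uses two bytes for code points 0x80-0x7FF.
utf8 = text.encode()            # same as text.encode('utf-8')
print(utf8.hex(' '))            # 0c c2 80 c3 b0

# Latin-1 maps each code point 0-255 to the single byte with the same value.
latin1 = text.encode('latin-1')
print(latin1.hex(' '))          # 0c 80 f0

# Code points above 255 have no Latin-1 byte, so this raises UnicodeEncodeError.
try:
    '\u20ac'.encode('latin-1')
except UnicodeEncodeError as exc:
    print('cannot encode:', exc.reason)
```

So a string built with chr() from byte values can be turned back with .encode('latin-1'), but it only works while every character stays below 256, and it is a detour compared to working on the bytes directly.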
In the code from the question, you're reading bytes and you're writing bytes, and in between you want to manipulate the byte stream: change one of the numbers, append others.
Python bytes are no different from str in that regard: they are both immutable types. If you did the same as above but with a string, you'd get the same kind of error:
>>> s = 'abcd'
>>> s[3] = 'x'
TypeError: 'str' object does not support item assignment
That is, in-place character manipulation is not supported for strings. There are other ways to achieve the same result though. In-place byte manipulation, on the other hand, is supported, arguably because it's a use case that is more common than for strings. You just need to use bytearray instead of bytes:
>>> data = bytearray(b'\x00\x01\x02\x03\x04')
>>> data[3] = 255
>>> print(data)
bytearray(b'\x00\x01\x02\xff\x04')
Which you can then write to a file without any encoding whatsoever:
pathlib.Path('result.bin').write_bytes(data)
(Note that bytes literals must be prefixed with b.)
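Putting the pieces together, a minimal end-to-end sketch of the task from the question might look like this. It assumes a throwaway input file created in a temporary directory, so it can run without touching real data. The eight-byte sample content is a placeholder, and the payload is a shortened stand-in for the longer sequence in the question.

```python
import pathlib
import tempfile

# Placeholder payload; the question appends a longer sequence of the same kind.
PAYLOAD = b'\x58\x02\x0c\x80\x61\x39'

with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / 'file.bin'      # hypothetical input file
    dst = pathlib.Path(tmp) / 'result.bin'    # hypothetical output file
    src.write_bytes(bytes(range(8)))          # sample content: 00 01 ... 07

    data = bytearray(src.read_bytes())        # mutable copy of the file content
    data[3] = 0x01                            # overwrite the byte at index 3
    data += PAYLOAD                           # append raw bytes, no encoding
    dst.write_bytes(data)                     # write_bytes accepts a bytearray

    result = dst.read_bytes()
    print(result.hex(' '))
```

The 0x80 byte in the payload comes out as a single 80 in the file, not as C2 80, because no string and no encoding is involved at any point.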
Upvotes: 1