Reputation: 33
I've saved a byte string of the form b'\x1e\x80E\xd7\xd4M\x94\xa8\xb4\xf3bl[^' to a text file, but when I read it back from that file, it is read as a normal string.
I've tried reading the files in binary mode, e.g. open(celesi_file_path, "rb"):
fciphertext = open(ciphertext_file_path, "rb")
fkey = open(celesi_file_path,"rb")
celesi = fkey.read()
ciphertext = fciphertext.read()
ciphertext = ciphertext.decode('latin-1')
celesi = celesi.decode('latin-1')
print(type(celesi))
print(type(ciphertext))
print(celesi)
print(ciphertext)
The output is a str containing the literal text "b'\x1e\x80E\xd7\xd4M\x94\xa8\xb4\xf3bl[^'" (b prefix, quotes and backslash escapes included), while I am expecting the actual characters rather than this escaped representation.
Upvotes: 1
Views: 75
Reputation: 44886
Take a look at this:
>>> data = b'\xd0\x9f\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'
>>> str(data)
"b'\\xd0\\x9f\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'"
So, if you wrote str(data) to the file, you wrote the slashes and x's, literally. You didn't write the bytes, you wrote the string representation of these bytes provided by Python. You wrote, in this example, 51 bytes (!) instead of the original 12.
You should've written the bytes themselves:
with open("data.bin", "wb") as f:
    f.write(data)
And then open this file in binary mode as well and read the bytes.
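As a rough sketch of the full round trip, something like the following should work. The file names and the temporary directory are made up for illustration. The second half assumes the existing file holds exactly the text produced by str(data) and nothing else; in that case ast.literal_eval can parse it back into a bytes object, so the data does not have to be regenerated.

```python
import ast
import os
import tempfile

data = b'\x1e\x80E\xd7\xd4M\x94\xa8\xb4\xf3bl[^'

# Hypothetical paths, used only for this example.
tmp_dir = tempfile.mkdtemp()
bin_path = os.path.join(tmp_dir, "data.bin")
txt_path = os.path.join(tmp_dir, "data.txt")

# Recommended: write the raw bytes and read them back in binary mode.
with open(bin_path, "wb") as f:
    f.write(data)
with open(bin_path, "rb") as f:
    restored = f.read()

print(type(restored))      # a bytes object, not str
print(restored == data)

# Recovery: simulate the mistake (writing str(data) as text), then
# parse the saved text back into bytes with ast.literal_eval.
with open(txt_path, "w", encoding="ascii") as f:
    f.write(str(data))
with open(txt_path, "r", encoding="ascii") as f:
    text = f.read().strip()

recovered = ast.literal_eval(text)
print(recovered == data)
```

ast.literal_eval only evaluates Python literals, so it is a safer choice than eval for this kind of repair. If the file has extra content around the literal, it would need to be trimmed first.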
Upvotes: 1