Reputation: 219
I'm trying to read a file's contents and convert them into what is actually stored in memory. If I write
file = open("filename","br")
binary = "0b"
for i in file.read():
    binary += bin(i)[2:]
will binary equal the actual value stored in memory?
If so, how can I convert this back into a string?
EDIT: I tried
file = open("filename.txt","br")
binary = ""
for i in file.read():
    binary += bin(i)[2:]
stored = ""
for bit in binary:
    stored += bit
    if len(stored) == 7:
        print(chr(eval("0b"+stored)), end="")
        stored = ""
It worked fine until it reached a space; after that the output turned into weird symbols and mixed-up letters.
Upvotes: 2
Views: 3986
Reputation: 11342
To get a (somewhat) accurate representation of the string as it is stored in memory, you need to convert each character into binary.
Assuming basic ASCII encoding (1 byte per character):
s = "python"
binlst = [bin(ord(c))[2:].rjust(8,'0') for c in s] # remove '0b' from string, fill 8 bits
binstr = ''.join(binlst)
print(s)
print(binlst)
print(binstr)
Output
python
['01110000', '01111001', '01110100', '01101000', '01101111', '01101110']
011100000111100101110100011010000110111101101110
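To go the other way and turn the bit string back into text, you can split it into 8-bit groups and convert each group back to a character. This is only a minimal sketch under the same 1-byte-per-character ASCII assumption; the helper name bits_to_text is made up for illustration:

```python
def bits_to_text(bits):
    # Walk the string in steps of 8; int(chunk, 2) parses the chunk as base 2,
    # and chr() maps the resulting code point back to a character.
    chars = [chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8)]
    return ''.join(chars)

binstr = '011100000111100101110100011010000110111101101110'
print(bits_to_text(binstr))  # python
```

This only round-trips cleanly because every character was padded to exactly 8 bits on the way in.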
For Unicode text encoded as UTF-8, each character takes 1 to 4 bytes, so you can't assume a fixed 8 bits per character. As @Yellen mentioned, it may be easier to just convert the file's bytes to binary.
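Working on bytes might look roughly like the sketch below. The helper names bytes_to_bits and bits_to_bytes are hypothetical, and the encoded string stands in for what open("filename.txt", "rb").read() would return. Padding every byte to 8 digits is also what the attempt in the question is missing: a space is 32, which bin() renders as the 6-digit '100000', so fixed-width chunking drifts out of alignment from that point on.

```python
def bytes_to_bits(data):
    # format(b, '08b') always yields exactly 8 digits per byte,
    # e.g. a space (32) becomes '00100000'.
    return ''.join(format(b, '08b') for b in data)

def bits_to_bytes(bits):
    # Rebuild the original bytes from consecutive 8-bit groups.
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

text = "caf\u00e9 au lait"            # contains one 2-byte UTF-8 character
data = text.encode("utf-8")           # stand-in for reading a file in "rb" mode
bits = bytes_to_bits(data)
restored = bits_to_bytes(bits).decode("utf-8")
print(restored == text)
```

Decoding happens only once, on the reassembled bytes, so multi-byte characters never need to be split by hand.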
Upvotes: 2