Reputation: 1221
The following code does not seem to read/write binary form correctly. It should read a binary file, bit-wise XOR the data and write it back to file. There are not any syntax errors but the data does not verify and I have tested the source data via another tool to confirm the xor key.
Update: per feedback in the comments, this is most likely due to the endianness of the system I was testing on.
def four_byte_xor(buf, key):
out = ''
for i in range(0,len(buf)/4):
c = struct.unpack("=I", buf[(i*4):(i*4)+4])[0]
c ^= key
out += struct.pack("=I", c)
return out
Call to xortools.py:
from xortools import four_byte_xor
in_buf = open('infile.bin','rb').read()
out_buf = open('outfile.bin','wb')
out_buf.write(four_byte_xor(in_buf, 0x01010101))
out_buf.close()
It appears that I need to read bytes per answer. How would the function above incorporate into the following as the function above manipulate multiple bytes? Or Does it not matter? Do I need to use struct?
with open("myfile", "rb") as f:
byte = f.read(1)
while byte:
# Do stuff with byte.
byte = f.read(1)
For an example the following file has 4 repeating bytes, 01020304:
The data is XOR'd with a key of 01020304 which zeros the original bytes:
Here is an attempt with the original function, in this case 05010501 is the result which is incorrect:
Upvotes: 11
Views: 36642
Reputation: 123393
Here's a relatively easy solution (tested):
import sys
from xortools import four_byte_xor
in_buf = open('infile.bin','rb').read()
orig_len = len(in_buf)
new_len = ((orig_len+3)//4)*4
if new_len > orig_len:
in_buf += ''.join(['x\00']*(new_len-orig_len))
key = 0x01020304
if sys.byteorder == "little": # adjust for endianess of processor
key = struct.unpack(">I", struct.pack("<I", key))[0]
out_buf = four_byte_xor(in_buf, key)
f = open('outfile.bin','wb')
f.write(out_buf[:orig_len]) # only write bytes that were part of orig
f.close()
What it does is pad the length of the data up to a whole multiple of 4 bytes, xor's that with the four-byte key, but then only writes out data that was the length of the original.
This problem was a little tricky because the byte-order of the data for a 4-byte key depends on your processor but is always written with the high-byte first, but the byte order of string or bytearrays is always written low-byte first as shown in your hex dumps. To allow the key to be specified as a hex integer, it was necessary to add code to conditionally compensate for the differing representations -- i.e. to allow the key's bytes can be specified in the same order as the bytes appearing in the hex dumps.
Upvotes: 3
Reputation: 43024
Try this function:
def four_byte_xor(buf, key):
outl = []
for i in range(0, len(buf), 4):
chunk = buf[i:i+4]
v = struct.unpack(b"=I", chunk)[0]
v ^= key
outl.append(struct.pack(b"=I", v))
return b"".join(outl)
I'm not sure you're actually taking the input by 4 bytes, but I didn't try to decipher it. This assumes your input is divisible by 4.
Edit, new function based in new input:
def four_byte_xor(buf, key):
key = struct.pack(b">I", key)
buf = bytearray(buf)
for offset in range(0, len(buf), 4):
for i, byte in enumerate(key):
buf[offset + i] = chr(buf[offset + i] ^ ord(byte))
return str(buf)
This could probably be improved, but it does provide the proper output.
Upvotes: 2