k0ss

Reputation: 105

Can't reproduce working C bitwise encoding function in Python

I'm reverse engineering a proprietary network protocol that generates a (static) one-time pad on launch and then uses that to encode/decode each packet it sends/receives. It uses the one-time pad in a series of complex XORs, shifts, and multiplications.

I have produced the following C code after walking through the decoding function in the program with IDA. This function encodes/decodes the data perfectly:

void encodeData(char *buf)
{
    int i;
    size_t bufLen = *(unsigned short *)buf;
    unsigned long entropy = *((unsigned long *)buf + 2);
    int xorKey = 9 * (entropy ^ ((entropy ^ 0x3D0000) >> 16));
    unsigned short baseByteTableIndex = (60205 * (xorKey ^ (xorKey >> 4)) ^ (668265261 * (xorKey ^ (xorKey >> 4)) >> 15)) & 0x7FFF;

    //Skip first 24 bytes, as that is the header
    for (i = 24; i <= (signed int)bufLen; i++)
        buf[i] ^= byteTable[((unsigned short)i + baseByteTableIndex) & 2047];
}

Now I want to try my hand at making a Peach fuzzer for this protocol. Since I'll need a custom Python fixup to do the encoding/decoding prior to doing the fuzzing, I need to port this C code to Python.

I've made the following Python function but haven't had any luck with it decoding the packets it receives.

def encodeData(buf):
    newBuf = bytearray(buf)
    bufLen = unpack('H', buf[:2])
    entropy = unpack('I', buf[2:6])
    xorKey = 9 * (entropy[0] ^ ((entropy[0] ^ 0x3D0000) >> 16))
    baseByteTableIndex = (60205 * (xorKey ^ (xorKey >> 4)) ^ (668265261 * (xorKey ^ (xorKey >> 4)) >> 15)) & 0x7FFF;
    #Skip first 24 bytes, since that is header data
    for i in range(24,bufLen[0]):
        newBuf[i] = xorPad[(i + baseByteTableIndex) & 2047]
    return str(newBuf)

I've tried with and without using array() or pack()/unpack() on various variables to force them to be the right size for the bitwise operations, but I must be missing something, because I can't get the Python code to work as the C code does. Does anyone know what I'm missing?

In case it would help you to try this locally, here is the one-time pad generating function:

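from array import array
from struct import pack

xorPad = b''  # must exist before buildXorPad() appends to it

def buildXorPad():
    global xorPad
    xorKey = array('H', [0xACE1])
    for i in range(0, 2048):
        # 16-bit Galois LFSR step: shift right, apply taps 0xB400 if the low bit was set
        xorKey[0] = (-(xorKey[0] & 1) & 0xB400) ^ (xorKey[0] >> 1)
        xorPad = xorPad + pack('B',xorKey[0] & 0xFF)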

And here are the hex-encoded original (encoded) and decoded packets.

Original: 20000108fcf3d71d98590000010000000000000000000000a992e0ee2525a5e5

Decoded: 20000108fcf3d71d98590000010000000000000000000000ae91e1ee25252525

Solution

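It turns out that my problem didn't have much to do with the difference between C and Python types, but rather two simple programming mistakes: I was reading entropy from bytes 2-6 instead of bytes 8-12, and I was assigning the pad byte to the buffer instead of XORing it in. The corrected function: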

from struct import unpack

def encodeData(buf):
    newBuf = bytearray(buf)
    bufLen = unpack('H', buf[:2])
    # (unsigned long *)buf + 2 points 8 bytes into the buffer, not 2
    entropy = unpack('I', buf[8:12])
    xorKey = 9 * (entropy[0] ^ ((entropy[0] ^ 0x3D0000) >> 16))
    baseByteTableIndex = (60205 * (xorKey ^ (xorKey >> 4)) ^ (668265261 * (xorKey ^ (xorKey >> 4)) >> 15)) & 0x7FFF
    #Skip first 24 bytes, since that is header data
    for i in range(24,bufLen[0]):
        padIndex = (i + baseByteTableIndex) & 2047
        # XOR the pad byte into the buffer instead of overwriting it
        newBuf[i] ^= unpack('B',xorPad[padIndex])[0]
    return str(newBuf)

Thanks everyone for your help!
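For anyone on Python 3, here is a rough, self-contained sketch of the same approach. It is not the exact code I ran: the names build_xor_pad and encode_data are made up for this sketch, the explicit little-endian format codes ('<H', '<I') are an assumption based on an x86 target, and the min() guard on the loop bound is an addition for safety.

```python
import struct

def build_xor_pad(seed=0xACE1, size=2048):
    """Build the pad with the 16-bit Galois LFSR from the question."""
    state = seed
    pad = bytearray()
    for _ in range(size):
        lsb = state & 1
        state >>= 1
        if lsb:
            state ^= 0xB400  # tap mask from the question
        pad.append(state & 0xFF)
    return bytes(pad)

def encode_data(buf, pad):
    """XOR the payload (bytes 24 onward) with pad bytes; header is untouched."""
    out = bytearray(buf)
    (buf_len,) = struct.unpack_from('<H', buf, 0)   # assumed little-endian
    (entropy,) = struct.unpack_from('<I', buf, 8)   # element 2 of a 4-byte view
    xor_key = 9 * (entropy ^ ((entropy ^ 0x3D0000) >> 16))
    mixed = xor_key ^ (xor_key >> 4)
    base = ((60205 * mixed) ^ ((668265261 * mixed) >> 15)) & 0x7FFF
    for i in range(24, min(buf_len, len(out))):
        out[i] ^= pad[(i + base) & 2047]
    return bytes(out)

pad = build_xor_pad()
packet = bytes.fromhex(
    '20000108fcf3d71d98590000010000000000000000000000a992e0ee2525a5e5')
encoded = encode_data(packet, pad)
# Encoding is its own inverse because the header (and so the key) never changes.
restored = encode_data(encoded, pad)
```

Since the entropy field lives in the untouched header, running encode_data twice should give back the original bytes, which is a cheap sanity check before wiring this into a fixup.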

Upvotes: 3

Views: 129

Answers (2)

Peter Gibson

Reputation: 19554

Python integers don't overflow: they are automatically promoted to arbitrary precision when they exceed sys.maxint (or fall below -sys.maxint-1).

>>> sys.maxint
9223372036854775807
>>> sys.maxint + 1
9223372036854775808L

Using array and/or unpack does not seem to make a difference (as you discovered):

>>> array('H', [1])[0] + sys.maxint
9223372036854775808L
>>> unpack('H', '\x01\x00')[0] + sys.maxint
9223372036854775808L

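To truncate your numbers, you'll have to simulate overflow by manually ANDing with an appropriate bitmask (for example 0xFFFFFFFF for a 32-bit value) after any operation that can make the value wider than the C type, such as multiplication, addition, or a left shift.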
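As a minimal sketch of that masking idea (the helper names mul32 and to_signed32 are made up for illustration, and a 32-bit width is assumed):

```python
MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits, like a C 32-bit unsigned type

def mul32(a, b):
    """Multiply, then wrap to 32 bits the way unsigned C arithmetic would."""
    return (a * b) & MASK32

def to_signed32(value):
    """Reinterpret a 32-bit pattern as a signed two's-complement integer."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

wrapped = mul32(0xFFFFFFFF, 2)       # full product is 0x1FFFFFFFE before masking
as_signed = to_signed32(0xFFFFFFFF)  # all bits set reads as -1 when signed
```

Whether the mask is actually needed depends on which bits of the result you end up using; if only low bits survive a final AND, the unmasked arithmetic may already agree with C.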

Upvotes: 1

samgak

Reputation: 24417

This line of C:

unsigned long entropy = *((unsigned long *)buf + 2);

should translate to

entropy = unpack('I', buf[8:12])

because buf is cast to an unsigned long * before 2 is added, so the addition advances the address by the size of 2 unsigned longs (8 bytes), not by 2 bytes (assuming an unsigned long is 4 bytes in size).
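To make the offset arithmetic concrete, here is a small sketch using the sample packet from the question. The helper name read_entropy is made up, the code is Python 3, and the little-endian '<I' format is an assumption (the question uses native byte order).

```python
import struct

UNSIGNED_LONG_SIZE = 4  # assumption: 4-byte unsigned long, as stated above

def read_entropy(buf):
    """Read element 2 of the buffer viewed as an array of 4-byte integers."""
    offset = 2 * UNSIGNED_LONG_SIZE  # pointer + 2 skips two elements = 8 bytes
    return struct.unpack_from('<I', buf, offset)[0]

packet = bytes.fromhex(
    '20000108fcf3d71d98590000010000000000000000000000a992e0ee2525a5e5')

entropy = read_entropy(packet)                       # bytes 8..11: 98 59 00 00
wrong_entropy = struct.unpack_from('<I', packet, 2)[0]  # bytes 2..5: 01 08 fc f3
```

The two reads give different values (0x5998 versus 0xF3FC0801), so every later step of the key derivation would diverge from the C version.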

Also:

newBuf[i] = xorPad[(i + baseByteTableIndex) & 2047]

should be

newBuf[i] ^= xorPad[(i + baseByteTableIndex) & 2047]

to match the C, otherwise the output isn't actually based on the contents of the buffer.
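A quick way to see the difference, in Python 3. The pad bytes below are not computed from the pad generator; they are simply the XOR of the first four payload bytes of the Original and Decoded packets in the question, used here for illustration.

```python
pad = bytes([0x07, 0x03, 0x01, 0x00])              # illustrative pad bytes
payload = bytes.fromhex('a992e0ee')                # first 4 payload bytes, Original

# Plain assignment: the result is just the pad, whatever the payload was.
assigned = bytes(pad[i] for i in range(len(payload)))

# XOR: the result depends on both the payload and the pad.
xored = bytes(b ^ p for b, p in zip(payload, pad))

# XOR with the same pad again recovers the payload, so one routine
# can both encode and decode.
round_trip = bytes(b ^ p for b, p in zip(xored, pad))
```

Here xored comes out as ae91e1ee, which matches the first four payload bytes of the Decoded packet.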

Upvotes: 2
