Reputation: 55

Python CRC32 Reverse Hash e8b7be43 to string "a"

I am trying to reverse a hash into a sequence of bytes. The function is shown below.

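import sys
import zlib

# Note: flip() and reverse() are helper functions that are not shown in this post.
def decrypt(hash, size):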
    bytes = bin(int(hash, 16)).replace("0b", "").zfill(32)
    bytes = flip(reverse(int(bytes, 2), 32), 33)
    div = "100000100110000010001110110110111"
    bytes = bytes.zfill(size*8)
    div = div.zfill(size*8)
    res = ""
    print(bytes + " - DIV")
    print(div + " - POLY")
    while div[0] != "1":
        res = ""
        for b in range(len(bytes)):
            if div[b] == bytes[b]:
                res = res + "0"
            else:
                res = res + "1"
        bytes = bin(int(res, 2) << 1).replace("0b", "").zfill(size*8)
        div = bin(int(div, 2) << 1).replace("0b", "").zfill(size*8)
        print(bytes + " - DIV")
        print(div + " - POLY")
    print(bytes)
    bytes = flip(bytes[0:8], 8) + flip(bytes[8:16], 8) + flip(bytes[16:24], 8) + flip(bytes[24:32], 8) + bytes[32:len(bytes)]
    bytes = bytes[0:len(bytes)-32]
    bytes = [int(reverse(int(bytes[(c*8):(c+1)*8], 2), 8), 2) for c in range(int(len(bytes)/8))]
    print(hex(zlib.crc32(bytearray(bytes))))
    return bytes
print(decrypt(sys.argv[1], int(sys.argv[2])+4))

And here is the output.

>>>python encrypt.py a
40
0xe8b7be43
[97]

>>>python decrypt.py e8b7be43 1
0000000000111101100000100001001011101000 - DIV
0000000100000100110000010001110110110111 - POLY
0000001001110010100001100001111010111110 - DIV
0000001000001001100000100011101101101110 - POLY
0000000011110110000010000100101110100000 - DIV
0000010000010011000001000111011011011100 - POLY
0000100111001010000110000111101011111000 - DIV
0000100000100110000010001110110110111000 - POLY
0000001111011000001000010010111010000000 - DIV
0001000001001100000100011101101101110000 - POLY
0010011100101000011000011110101111100000 - DIV
0010000010011000001000111011011011100000 - POLY
0000111101100000100001001011101000000000 - DIV
0100000100110000010001110110110111000000 - POLY
1001110010100001100001111010111110000000 - DIV
1000001001100000100011101101101110000000 - POLY
1001110010100001100001111010111110000000
0xa0058808
[198]

The issue here is that the hash of the bytearray that the "decrypt" function generates does not match the input hash e8b7be43. How do I fix this?

Upvotes: -1

Answers (2)

Mark Adler

Reputation: 112502

Here is an example in C of reversing a CRC-32 value to produce the four-byte message and any shorter messages that have that CRC:

// Reverse a CRC-32 into the unique four-byte and any shorter sequences that
// have that CRC. This is implemented for the standard ISO-HDLC / zlib CRC-32.
//
// Placed into the public domain by Mark Adler, 19 Dec 2024.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define POLYROT 0xdb710641      // Reflected CRC polynomial rotated left one

// Get a CRC-32 from the command line and show the four-byte plus any shorter
// sequences that give that CRC-32. The CRC can be provided in hexadecimal by
// preceding it with "0x". The results are shown in hexadecimal.
int main(int argc, char **argv) {
    if (argc != 2) {
        fputs("Usage: crcseqs 0x352441c2\n", stderr);
        return 1;
    }
    uint32_t crc = strtoul(argv[1], NULL, 0);
    if (crc == 0)
        // The zero-byte sequence has a CRC-32 of zero.
        puts("<empty sequence>");
    crc = ~crc;
    for (int n = 1; n <= 4; n++) {
        // See if there is an n-byte sequence with that CRC-32.
        for (int k = 0; k < 8; k++)
            crc = crc & 0x80000000 ? (crc << 1) ^ POLYROT : crc << 1;
        if (n == 4 || (~crc >> (n << 3)) == 0)
            // Yes, there is.
            for (int i = 0; i < n; i++)
                printf("%02x%c", (~crc >> (i << 3)) & 0xff,
                       i < n - 1 ? ' ' : '\n');
    }
    return 0;
}

The statement in the inner for loop runs the CRC backwards over one zero bit. It undoes the corresponding forward step of a reflected CRC, shown here, where POLY is 0xedb88320 for the standard CRC-32:

    crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;

The inner for loop does that eight times to run the CRC backwards over one zero byte.
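As a rough Python sketch of that claim, the forward and backward steps over one zero byte can be written side by side and checked to be inverses. The function names here are made up for illustration; only the two constants come from the answer.

```python
POLY = 0xEDB88320      # reflected CRC-32 polynomial
POLYROT = 0xDB710641   # POLY rotated left by one bit
MASK = 0xFFFFFFFF      # keep values in 32 bits, as uint32_t does in C

def forward_zero_byte(crc):
    """Advance the raw CRC register over one zero byte (hypothetical helper)."""
    for _ in range(8):
        crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
    return crc

def backward_zero_byte(crc):
    """Step the raw CRC register back over one zero byte (hypothetical helper)."""
    for _ in range(8):
        if crc & 0x80000000:
            crc = ((crc << 1) ^ POLYROT) & MASK
        else:
            crc = (crc << 1) & MASK
    return crc

state = 0x12345678
# Each direction undoes the other.
assert backward_zero_byte(forward_zero_byte(state)) == state
assert forward_zero_byte(backward_zero_byte(state)) == state
```

The explicit masking is needed only because Python integers are unbounded, whereas the C code relies on uint32_t wrapping.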

We will consider the n == 4 case, where that loop has been run four times to process four zero bytes backwards, determining the initial CRC required to get the given CRC when run over a message of four zero bytes.

CRCs are linear over GF(2). That means that if we take two CRC calculations, each with an initial CRC, the message bytes, and a final CRC, we can exclusive-or those together and get a new initial CRC, message bytes, and final CRC that represent a correct CRC calculation, without having to have done that calculation.
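A small Python check of that linearity, assuming a "raw" CRC with no initial-value or final exclusive-or applied by the function itself (raw_crc is a name invented for this sketch). The two messages must have the same length for the exclusive-or to make sense.

```python
POLY = 0xEDB88320  # reflected CRC-32 polynomial

def raw_crc(init, data):
    """Bitwise reflected CRC starting from init, with no final exclusive-or."""
    crc = init
    for octet in data:
        crc ^= octet
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
    return crc

# Two unrelated calculations over messages of equal length.
init_a, msg_a = 0x01234567, b"spam"
init_b, msg_b = 0x89ABCDEF, b"eggs"

# Exclusive-or the initial CRCs and the messages together ...
init_x = init_a ^ init_b
msg_x = bytes(p ^ q for p, q in zip(msg_a, msg_b))

# ... and the final CRC is the exclusive-or of the two final CRCs.
assert raw_crc(init_x, msg_x) == raw_crc(init_a, msg_a) ^ raw_crc(init_b, msg_b)
```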

Consider what happens if we take the initial CRC that we calculated for n == 4, and then run a CRC calculation over that CRC as the message bytes. (For this to work for a reflected CRC, we need to process the CRC bytes in little-endian order.) The usual CRC calculation for one byte, let's call it octet, is:

    crc ^= octet;
    for (int k = 0; k < 8; k++)
        crc = crc & 1 ? (crc >> 1) ^ POLY : crc >> 1;

For the first byte, that ^= exclusive-ors the low byte of the CRC with the low byte of the CRC, giving zero in the low byte. Then the for loop will see eight zero bits, and so simply shift the CRC right eight bits. This repeats for the next three bytes, getting zero in the low byte each time, resulting in a final CRC of zero.
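That cancellation can be checked in Python with the same kind of raw CRC (zero final exclusive-or, caller-supplied initial value). The helper name and the sample values are arbitrary choices for this sketch.

```python
POLY = 0xEDB88320  # reflected CRC-32 polynomial

def raw_crc(init, data):
    """Bitwise reflected CRC starting from init, with no final exclusive-or."""
    crc = init
    for octet in data:
        crc ^= octet
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
    return crc

# Feeding any 32-bit register value to itself, low byte first, drains it to zero.
for init in (0x00000001, 0xCAFEBABE, 0xFFFFFFFF):
    assert raw_crc(init, init.to_bytes(4, "little")) == 0
```

Feeding the bytes in big-endian order does not cancel in general, which is why the little-endian order matters for a reflected CRC.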

We now have one CRC calculation consisting of the initial CRC that we computed backwards, a four-byte message that is that same CRC in little-endian order, and a final CRC of zero. The backwards computation gave us a second calculation with the same initial CRC, a message of four zero bytes, and the desired final CRC. Exclusive-oring the two, component by component, gives a new triplet: a zero initial CRC (the two initial CRCs cancel), a four-byte message that is the computed initial CRC in little-endian order, and the desired final CRC, since it was exclusive-or'ed with zero.

And there we have it! Those four bytes with an initial CRC of zero give the desired final CRC that was provided on the command line.
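The whole argument for n == 4 can be traced in Python as a sketch, still assuming a zero initial value and no final exclusive-or. The names raw_crc and back_zero_bytes are invented here, and the target is the sample value from the usage message.

```python
POLY = 0xEDB88320      # reflected CRC-32 polynomial
POLYROT = 0xDB710641   # POLY rotated left by one bit
MASK = 0xFFFFFFFF

def raw_crc(init, data):
    """Bitwise reflected CRC starting from init, with no final exclusive-or."""
    crc = init
    for octet in data:
        crc ^= octet
        for _ in range(8):
            crc = (crc >> 1) ^ POLY if crc & 1 else crc >> 1
    return crc

def back_zero_bytes(crc, count):
    """Run the raw CRC register backwards over count zero bytes."""
    for _ in range(8 * count):
        crc = ((crc << 1) ^ POLYROT) & MASK if crc & 0x80000000 else (crc << 1) & MASK
    return crc

target = 0x352441C2
init = back_zero_bytes(target, 4)
message = init.to_bytes(4, "little")

# First calculation: (init, four zero bytes) -> target
assert raw_crc(init, bytes(4)) == target
# Second calculation: (init, init in little-endian order) -> 0
assert raw_crc(init, message) == 0
# Their exclusive-or: (0, init in little-endian order) -> target
assert raw_crc(0, message) == target
```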

All of the explanation above assumes that the CRC definition has an initial value of zero and a final exclusive-or of zero, i.e. no exclusive-or. However, the standard CRC-32 here has an initial value of 0xffffffff and a final exclusive-or of 0xffffffff. The ~'s in the code take care of that.
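Since the question was asked in Python, here is a rough port of the C program, checked against zlib.crc32. This is a sketch rather than a drop-in replacement: the function name crc32_preimages is made up, and the `& MASK` operations stand in for the uint32_t arithmetic of the C code.

```python
import zlib

POLYROT = 0xDB710641   # reflected CRC-32 polynomial rotated left by one bit
MASK = 0xFFFFFFFF

def crc32_preimages(target):
    """Return the byte strings of length 0 to 4 whose zlib CRC-32 is target."""
    found = []
    if target == 0:
        found.append(b"")              # the empty sequence has a CRC-32 of zero
    crc = ~target & MASK               # undo the final exclusive-or
    for n in range(1, 5):
        # Run the register backwards over one more zero byte.
        for _ in range(8):
            crc = ((crc << 1) ^ POLYROT) & MASK if crc & 0x80000000 else (crc << 1) & MASK
        candidate = ~crc & MASK        # account for the 0xffffffff initial value
        if n == 4 or candidate >> (8 * n) == 0:
            found.append(candidate.to_bytes(4, "little")[:n])
    return found

for seq in crc32_preimages(0xE8B7BE43):
    assert zlib.crc32(seq) == 0xE8B7BE43
    print(" ".join(f"{b:02x}" for b in seq))
```

For the hash in the question, the results include the one-byte sequence 61, which is "a", along with the four-byte sequence that always exists.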

I will leave it to the reader as an exercise to determine how this works for n == 1, 2, and 3.

Upvotes: 3

J_H

Reputation: 20550

Please understand that "to hash" is quite different from "to encrypt". Yes, one can build a cipher from secure hashes. But that's not what is going on here, and CRC-32 is in no way a secure hash, quite aside from the fact that a security parameter of 32 bits is easily brute-forced.

A hash is a one-way function, which might have several competing design considerations such as "small", "fast", "cryptographically secure". It reduces a large number of bits down to a fixed number such as 512 or 32.

A symmetric cipher turns N bits of cleartext into ~ N bits of ciphertext, with the possibility to go back and forth, via decrypt and encrypt routines. This is not at all like a hash that turns a million bits of input into a small fixed size output.

How do I fix this?

Read Crypto Eng. or a similar reference, and rethink your approach to the business problem. You want to bring relevant cryptographic primitives to bear upon it.

Upvotes: -2
