Tom Barclay
Tom Barclay

Reputation: 1

use of CRC32 over short 16byte messages

I want to use a CRC32 on a short message (16 bytes) to use as a unique key for another purpose. Will the CRC32 value be guaranteed to be unique for every possible 16 byte message ? Given that there can be 2^128 messages (or over 3.40 × 10^38 if you prefer) its not testable on a spreadsheet.

Upvotes: 0

Views: 567

Answers (2)

rcgldr
rcgldr

Reputation: 28826

You don't need to test all 2^128 possible patterns. For 128 data bits, the best 32 bit CRC polynomials can do is detect up to 7 bit errors.

https://users.ece.cmu.edu/~koopman/crc/crc32.html

In general, these 7 bit error detecting CRCs can only correct up to 3 bit errors, so there are some 4 bit patterns that will produce duplicates. A test program only needs to test comb(128,4) = 10,668,000 patterns to search for duplicates between 4 bit patterns. A 4 bit pattern can't duplicate a 1, 2, or 3 bit pattern, because worst case that would be a total of 7 error bits, which these CRCs are guaranteed to detect.

I tested using CRC polynomial 0x1f1922815, with all zero bits except for the pattern to be tested and the first collision found was between these two 4 bit patterns:

crc(00 00 00 00 00 00 40 00 00 00 00 00 11 00 08 00) = 87be3

crc(40 00 00 00 04 00 00 08 00 00 00 00 00 00 04 00) = 87be3

Upvotes: 1

Alex Guteniev
Alex Guteniev

Reputation: 13679

No. CRC32 are 32 bits, so there's no way to map 128 bits uniquely to 32 bits.

Checksum functions, like CRC32, are useful to detect common corruption (like one byte changed), not for unique identification.

Close kind of functions is hash functions that try to give unique value for every input, but they don't guarantee that there are no collisions, only try to minimize their probability for typical input distributions.

There's also "perfect hash function" that is guaranteed to be unique, but it limits input set to hash function output range and is typically implemented as lookup table.

Upvotes: 1

Related Questions