Reputation: 1
I want to use a CRC32 on a short message (16 bytes) to use as a unique key for another purpose. Will the CRC32 value be guaranteed to be unique for every possible 16 byte message ? Given that there can be 2^128 messages (or over 3.40 × 10^38 if you prefer) its not testable on a spreadsheet.
Upvotes: 0
Views: 567
Reputation: 28826
You don't need to test all 2^128 possible patterns. For 128 data bits, the best 32 bit CRC polynomials can do is detect up to 7 bit errors.
https://users.ece.cmu.edu/~koopman/crc/crc32.html
In general, these 7 bit error detecting CRCs can only correct up to 3 bit errors, so there are some 4 bit patterns that will produce duplicates. A test program only needs to test comb(128,4) = 10,668,000 patterns to search for duplicates between 4 bit patterns. A 4 bit pattern can't duplicate a 1, 2, or 3 bit pattern, because worst case that would be a total of 7 error bits, which these CRCs are guaranteed to detect.
I tested using CRC polynomial 0x1f1922815, with all zero bits except for the pattern to be tested and the first collision found was between these two 4 bit patterns:
crc(00 00 00 00 00 00 40 00 00 00 00 00 11 00 08 00) = 87be3
crc(40 00 00 00 04 00 00 08 00 00 00 00 00 00 04 00) = 87be3
Upvotes: 1
Reputation: 13679
No. CRC32 are 32 bits, so there's no way to map 128 bits uniquely to 32 bits.
Checksum functions, like CRC32, are useful to detect common corruption (like one byte changed), not for unique identification.
Close kind of functions is hash functions that try to give unique value for every input, but they don't guarantee that there are no collisions, only try to minimize their probability for typical input distributions.
There's also "perfect hash function" that is guaranteed to be unique, but it limits input set to hash function output range and is typically implemented as lookup table.
Upvotes: 1