Reputation: 3777
I would like to use natural numbers (actually record IDs) as unique identifiers, but I'm not sure if there are two natural numbers that satisfy the following condition:
md5(number1) === md5(number2)
Is the previous condition possible?
Upvotes: 0
Views: 270
Reputation: 262
TL;DR: Yes, because there are finite different hashes and infinite numbers, then use pigeonhole principle.
MD5 produces 128-bit digest, therefore there are 2^128 different hash values.
Now, suppose you have a set of 2^128 different natural numbers.
If you calculate, one by one, the hash for each number in this set, you will have two possibilities while calculating them:
that you get a hash that you have already calculated for some previous number and therefore you have found a collision (i.e., your condition is met).
That you arrive at the last natural number and you have not obtained any repeated hash, i.e., each number in the set has obtained a different hash and therefore you have exactly 2^128 different hash values. After this, imagine you calculate the hash for a natural number that is not in the set, since there are no more distinct hashes, then this number will have to have the same hash as some number in the set and therefore again you have a collision (this is called the pigeonhole principle).
As you see, in any case, you always have a collision. More generally, for any hashing algorithm that produces a digest of N bits, it is enough to have 2^N + 1 distinct input elements to generate a collision.
Upvotes: 0
Reputation: 3777
The answer is "yes", because MD5 produces a 128-bit hash value (fixed length): https://en.wikipedia.org/wiki/MD5
So there are two numbers that produces the same md5 code, since there are infinite numbers.
Upvotes: 1