Reputation: 804
I have an MD5 function which I have confirmed works correctly for both files and strings. But when I use it on variable-sized chunks of very large files, it produces identical MD5 values for chunks of different sizes.
Is there any realistic probability that two chunks of different lengths, possibly with partly the same content, result in the same MD5 fingerprint?
Upvotes: 2
Views: 2568
Reputation: 525
You have practically no chance of getting the same MD5 hash for different inputs unless you deliberately try to construct a collision.
See here for more information about collisions: http://www.mscs.dal.ca/~selinger/md5collision/
Upvotes: 2
Reputation: 33149
The odds of this happening by chance for two given inputs are 1 in 2^128, since MD5 is a 128-bit hash. That is roughly 1 in 3.4 x 10^38, so it's extremely unlikely but not impossible.
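As a quick check of that arithmetic, here is a small Python sketch. It assumes MD5 output is uniformly distributed over all 128-bit values, which is an idealization; the variable names are just for illustration.

```python
# Number of distinct 128-bit MD5 outputs.
space = 2 ** 128

# Chance that two given, unrelated inputs share a digest,
# assuming digests are uniformly distributed (an idealization).
probability = 1 / space

print(f"{space:.3e}")        # 3.403e+38
print(f"{probability:.3e}")
```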
It's more probable, I think, that you're doing something wrong and you are actually calculating the MD5 of the same text/file every time.
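One way to rule that out is to hash each chunk with a fresh hash object and make sure the offset really advances between chunks. The following is only a sketch of what a correct loop might look like: `chunk_digests`, `data`, and the chunk sizes are hypothetical names and values, not taken from your code.

```python
import hashlib

def chunk_digests(data: bytes, sizes):
    """Return one MD5 hex digest per chunk (hypothetical helper)."""
    digests = []
    offset = 0
    for size in sizes:
        # Slice out exactly this chunk; a common bug is re-hashing
        # the same buffer because the offset never moves.
        chunk = data[offset:offset + size]
        # A new md5 object per chunk, so no state leaks between chunks.
        digests.append(hashlib.md5(chunk).hexdigest())
        offset += size
    return digests

data = b"abcdefghij" * 10               # stand-in for file contents
digests = chunk_digests(data, [7, 13, 29])
for d in digests:
    print(d)
```

If your own loop prints the same digest for every chunk, check that the buffer being hashed actually changes on each iteration.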
Upvotes: 6