Ben Yitzhaki

Reputation: 1426

How unique is a portion of an md5 hash?

I have a question about the uniqueness of the output of the md5 function.

I know that md5 hashes (of a microtime value) are not guaranteed to be unique; however, they are pretty unique :)

How can I calculate the probability of a collision between two portions of md5 hashes?

For example, the following PHP generates an 8-character string from the md5 result:

substr(md5(microtime()), 0, 8);

A second scenario: what if the starting index is random (so it takes a different portion of the hash each time)?

substr(md5(microtime()), rand(0, 24), 8);

Upvotes: 1

Views: 1377

Answers (3)

Matt Timmermans

Reputation: 59368

There are 16^8 = 2^32 combinations of 8 hexadecimal digits. Even if they are completely random, the birthday paradox means you can only generate on the order of 2^16 (about 65,000) such strings before you should expect 2 that are the same; the 50% point is at roughly 77,000.
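To put numbers on this, here is a minimal sketch using the standard birthday approximation, p ≈ 1 - exp(-n(n-1)/(2N)). It assumes the 8-character strings are uniformly random over N = 2^32 values; the function names are made up for illustration.

```php
<?php
// Approximate probability that at least two of $n uniformly random values
// collide when there are $space possible values (birthday approximation).
function collisionProbability(int $n, float $space): float
{
    return 1.0 - exp(-$n * ($n - 1) / (2.0 * $space));
}

// Approximate number of values needed to reach collision probability $p.
function valuesForProbability(float $p, float $space): int
{
    return (int) ceil(sqrt(2.0 * $space * log(1.0 / (1.0 - $p))));
}

$space = 2 ** 32; // 8 hex digits = 32 bits

// Chance of at least one collision after 65,536 strings.
printf("%.4f\n", collisionProbability(65536, $space));

// How many strings until a collision is as likely as not.
printf("%d\n", valuesForProbability(0.5, $space));
```

Real microtime() inputs are not uniformly random, so treat these figures as a best case rather than a guarantee.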

Using md5(), with a random index or not, doesn't significantly change anything as long as all the microtime() values you use are unique. But if you are generating these too fast, or across many machines, then the situation is much, much worse, because there's a good chance you could end up using the same microtime() value twice.

Upvotes: 1

Gianluca Ghettini

Reputation: 11678

It depends on how many "sub-hashes" you are going to generate and how many bits of the original MD5 hash you keep (the length of a "sub-hash"). If you generate just 1 sub-hash and keep just 1 bit, there is no collision at all. If you generate 2 sub-hashes, expect a 50% chance of collision. Use 2 bits and the odds drop to 25%. You do the math. Refer to the birthday paradox for more info.
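The small cases above can be checked with an exact calculation. This is only a sketch: it assumes the kept bits are uniformly random, and the function name is hypothetical.

```php
<?php
// Exact probability that $count uniformly random values of $bits bits
// contain at least one duplicate. Assumes $bits is small enough (<= 62)
// for 2 ** $bits to stay an integer.
function exactCollisionProbability(int $count, int $bits): float
{
    $space = 2 ** $bits;
    if ($count > $space) {
        return 1.0; // pigeonhole: more values than possible slots
    }
    $allDistinct = 1.0;
    for ($i = 0; $i < $count; $i++) {
        // The ($i + 1)-th value must avoid the $i values already taken.
        $allDistinct *= ($space - $i) / $space;
    }
    return 1.0 - $allDistinct;
}

printf("%.2f\n", exactCollisionProbability(1, 1)); // 0.00
printf("%.2f\n", exactCollisionProbability(2, 1)); // 0.50
printf("%.2f\n", exactCollisionProbability(2, 2)); // 0.25
```

For 8 hex characters you would pass 32 as the bit count.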

Upvotes: 0

Niklesh Raut

Reputation: 34924

Uniqueness of your string is really a matter of probability: the larger the character set you draw from and the longer the random string, the lower the chance of generating the same string twice.

So, to guarantee a unique string, store the generated strings in your DB and compare each new random string against them. If it already exists, generate a fresh string, and repeat until you get one that is unique.
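A rough sketch of that retry loop is below. It uses an in-memory array as a stand-in for the DB lookup, and generateUnique, $seen and $maxAttempts are hypothetical names chosen for illustration.

```php
<?php
// Keeps generating until the candidate is not already in $seen.
// $seen is an in-memory stand-in for a database table of issued strings;
// $generate is any function returning a candidate string.
function generateUnique(callable $generate, array &$seen, int $maxAttempts = 100): string
{
    for ($attempt = 0; $attempt < $maxAttempts; $attempt++) {
        $candidate = $generate();
        if (!isset($seen[$candidate])) {
            $seen[$candidate] = true; // record it so later calls see it
            return $candidate;
        }
    }
    throw new RuntimeException('No unique string found after ' . $maxAttempts . ' attempts');
}

$seen = [];
$makeId = fn (): string => substr(md5(microtime()), 0, 8);

$first = generateUnique($makeId, $seen);
$second = generateUnique($makeId, $seen);
echo $first, "\n", $second, "\n";
```

With a real database, a unique index on the column and a retry on insert failure would probably be safer than check-then-insert, since two processes could otherwise pick the same string at the same time.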

Upvotes: 0
