Reputation: 33
I don't have the possibility of using Boost, hence I wrote a rather simple code to generate a UUID. However, I don't know how to calculate how unique a UUID generated by this code is! Can someone please explain how can I calculate the probability of the UUID, as generated by the following code, to be unique?
string UUID;
uint8_t uuidLength = 32;
uint8_t asciiSets[][2]={
{48, 10}, //Numbers 0-1
{65, 26}, //Alphabet a-z
{97, 26} //Alphabet A-Z
};
UUID.reserve(uuidLength);
uint8_t counter;
srand(time(NULL));
for(counter = 0;counter<uuidLength;counter++)
{
uint8_t set = rand()%3;//Choose one of the three sets of ascii ranges
uint8_t start = asciiSets[set][0];
uint8_t range = asciiSets[set][1];
UUID[counter] = rand()%range+start;
}
Upvotes: 0
Views: 558
Reputation: 3094
The probabilities associated with each character is 1/30 for digits and 1/(3*26) for letters. That makes an entropy of 4.038 bits compared to 4.13 if the distribution was uniformed. This amounts to H=32*4.038=129.22 bits of entropy total. This is roughly equivalent to sampling for a uniform distribution with 2^129.22 possibilities
If you look for the probability of collisions, it will depend on the number of uuid you have generated obviously. You should have a look at https://en.wikipedia.org/wiki/Birthday_attack as suggested. As you will see, the probability of having a collision with n samples is given, you can apply this formula to 2^H with H given above.
The code to compute entropy:
import numpy as np
probs = [1/30.]*10 + [1/(3*26.)]*52
ent = -np.sum(probs*np.log(probs))
Upvotes: 2
Reputation: 179779
The first thing to observe is that this is very unlikely to generate a UUID. A UUID contains 32 hexadecimal digits, and only 22 out of 62 characters that you generate are hexadecimal digits. That's about 1 in 3 per digit, so the chances are 1 in 250 trillion that you will actually generate a 32 digit hexadecimal number.
Just use Boost.
Upvotes: 2