user2219907
user2219907

Reputation: 33

Calculating the probability of the UUID to be Unique

I don't have the possibility of using Boost, hence I wrote a rather simple code to generate a UUID. However, I don't know how to calculate how unique a UUID generated by this code is! Can someone please explain how can I calculate the probability of the UUID, as generated by the following code, to be unique?

string UUID;
uint8_t uuidLength = 32;
uint8_t asciiSets[][2]={
     {48, 10},     //Numbers 0-1
     {65, 26},     //Alphabet a-z
     {97, 26}      //Alphabet A-Z
};
UUID.reserve(uuidLength);

uint8_t counter;
srand(time(NULL));

for(counter = 0;counter<uuidLength;counter++)
{
    uint8_t set = rand()%3;//Choose one of the three sets of ascii ranges
    uint8_t start = asciiSets[set][0];
    uint8_t range = asciiSets[set][1];
    UUID[counter] = rand()%range+start;
}

Upvotes: 0

Views: 558

Answers (2)

Anis
Anis

Reputation: 3094

The probabilities associated with each character is 1/30 for digits and 1/(3*26) for letters. That makes an entropy of 4.038 bits compared to 4.13 if the distribution was uniformed. This amounts to H=32*4.038=129.22 bits of entropy total. This is roughly equivalent to sampling for a uniform distribution with 2^129.22 possibilities

If you look for the probability of collisions, it will depend on the number of uuid you have generated obviously. You should have a look at https://en.wikipedia.org/wiki/Birthday_attack as suggested. As you will see, the probability of having a collision with n samples is given, you can apply this formula to 2^H with H given above.

The code to compute entropy:

import numpy as np
probs = [1/30.]*10 + [1/(3*26.)]*52
ent = -np.sum(probs*np.log(probs))

Upvotes: 2

MSalters
MSalters

Reputation: 179779

The first thing to observe is that this is very unlikely to generate a UUID. A UUID contains 32 hexadecimal digits, and only 22 out of 62 characters that you generate are hexadecimal digits. That's about 1 in 3 per digit, so the chances are 1 in 250 trillion that you will actually generate a 32 digit hexadecimal number.

Just use Boost.

Upvotes: 2

Related Questions