tom

Reputation: 73

Are the odds of a cryptographically secure random number generator generating the same uuid small enough that you do not need to check for uniqueness?

I'm using this with a length of 20 for uuid. Is it common practice to skip checking whether a generated uuid has already been used when it serves as a persistent unique value?

Or is it best practice to verify that it's not already in use by some part of your application if it's essential to retain uniqueness?

Upvotes: 3

Views: 1425

Answers (2)

AutoBootDisk

Reputation: 99

crypto.randomBytes is safe enough for most applications. If you want it to be completely secure, use a length of 16. With a length of 16, a collision is unlikely to occur within the next century. It is also not a good idea to check an entire database for duplicates, because the odds of a collision are so low that the performance cost outweighs the added safety.
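As a rough illustration of generating such an identifier, here is a minimal sketch in Python rather than Node.js (an assumption made only for illustration; `new_id` is a hypothetical helper, not part of any library). Python's `secrets` module draws from the operating system's cryptographically secure generator.

```python
import secrets

def new_id(num_bytes=16):
    """Return a random identifier as a hex string (hypothetical helper)."""
    # token_hex(n) reads n bytes from the OS CSPRNG and hex-encodes them,
    # so the result is 2 * n characters long.
    return secrets.token_hex(num_bytes)

if __name__ == "__main__":
    print(new_id())    # 16 random bytes -> 32 hex characters
    print(new_id(20))  # 20 random bytes -> 40 hex characters
```

Each call returns a fresh value, so the identifier should be generated once and then stored.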

Upvotes: 1

r3mainer

Reputation: 24557

You can calculate the probability of a collision using this formula from Wikipedia:

     n(p;H) ≈ √{2H ln[1/(1-p)]}

where n(p; H) is the smallest number of samples you have to choose in order to find a collision with a probability of at least p, given H possible outputs with equal probability.

The same article also provides Python source code that you can use to calculate this value:

from math import log1p, sqrt

def birthday(probability_exponent, bits):
    probability = 10. ** probability_exponent
    outputs     =  2. ** bits
    return sqrt(2. * outputs * -log1p(-probability))

So if you're generating UUIDs with 20 bytes (160 bits) of random data, how sure can you be that there won't be any collisions? Let's suppose you want there to be a probability of less than one in a quintillion (10^-18) that a collision will occur:

>>> birthday(-18,160)
1709679290002018.5

This means that after generating about 1.7 quadrillion UUIDs with 20 bytes of random data each, there is only a one in a quintillion chance that any two of these UUIDs will be the same.

Basically, 20 bytes is perfectly adequate.
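The same approximation can be turned around to estimate the collision probability for a given number of IDs. Solving the formula above for p gives p ≈ 1 - exp(-n² / 2H). The sketch below assumes that rearrangement; `collision_probability` is a hypothetical name and is not taken from the Wikipedia article.

```python
from math import expm1

def collision_probability(samples, bits):
    """Approximate chance of at least one collision among `samples`
    uniformly random values of `bits` bits each (birthday bound)."""
    outputs = 2.0 ** bits
    # 1 - exp(-x) computed as -expm1(-x), which stays accurate
    # when x is extremely small.
    return -expm1(-(samples * samples) / (2.0 * outputs))

if __name__ == "__main__":
    # Feeding back the sample count computed above for 160 bits
    print(collision_probability(1709679290002018.5, 160))
    # The same number of IDs with only 16 bytes (128 bits)
    print(collision_probability(1709679290002018.5, 128))
```

This makes it easy to plug in the number of IDs your own application expects to generate and compare different lengths.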

Upvotes: 2
