Reputation: 3385
I'm using the UUID library in Python to generate unique IDs for objects. More specifically, I'm basically doing
id_for_something = uuid.uuid4().hex
My question is, do I have to use the entire hexadecimal string value to guarantee that the IDs will be unique? Or is it okay to use, for example, the first 4 digits? I'm simply asking because using the entire string seems a bit long. Thanks!
Upvotes: 0
Views: 828
Reputation: 56467
UUIDs are never guaranteed to be unique. For example, if you generate 2.7 * 10^18 UUID4 values, then you have a 50% chance of generating a collision (see wiki). But this number is huge, so we rarely care about it. This is of course under the assumption that the underlying random generator is good enough.
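However, if you shorten the UUID, then you substantially increase that probability. For a UUID truncated to 4 bytes, i.e. 8 hex digits (let's call it SHORTUUID), you have 16^8 = 2^32 combinations, which implies (due to the birthday paradox) that after generating roughly 77k such SHORTUUIDs (on the order of 2^16 ≈ 65k) you will have over a 50% probability of a collision (see this and note that a 4-byte UUID is the same as a 32-bit integer). If you keep only the first 4 hex digits, as in your example, there are just 16^4 = 65,536 combinations and you reach 50% after about 300 IDs. That number is low, like pathetically low. And depending on which part you keep it can get worse, since not all of a UUID4 is random: 6 of its 128 bits are fixed version and variant bits.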
So if you care about collisions then don't do that.
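If you want to check such numbers yourself, here is a minimal sketch using the standard birthday approximation n ≈ sqrt(2 * N * ln(1 / (1 - p))), where N is the number of possible IDs. The function name `ids_for_collision_probability` is just something I made up for illustration, and the result is an estimate, not an exact count.

```python
import math


def ids_for_collision_probability(bits: int, p: float = 0.5) -> float:
    """Estimate how many random IDs give a collision with probability p.

    `bits` is the number of random bits in each ID, so there are
    N = 2**bits possible values. Uses the birthday approximation
    n ~= sqrt(2 * N * ln(1 / (1 - p))).
    """
    space = 2 ** bits
    return math.sqrt(2 * space * math.log(1 / (1 - p)))


if __name__ == "__main__":
    # 122 is the number of random bits in a UUID4 (128 minus 6 fixed bits).
    for label, bits in [
        ("first 4 hex digits", 16),
        ("first 8 hex digits (4 bytes)", 32),
        ("full UUID4", 122),
    ]:
        print(f"{label}: ~{ids_for_collision_probability(bits):.3g} IDs")
```

This gives roughly 300 IDs for 16 bits, about 77,000 for 32 bits, and about 2.7 * 10^18 for a full UUID4.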
If you want to shorten UUIDs, then I advise using base64 encoding instead of hex: it needs 22 characters instead of 32 and keeps all 128 bits.
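A minimal sketch of the base64 idea, using only the standard `uuid` and `base64` modules. The helper names `encode_uuid` and `decode_uuid` are hypothetical, and I'm assuming the URL-safe alphabet is acceptable for your IDs (it uses `-` and `_` instead of `+` and `/`).

```python
import base64
import uuid


def encode_uuid(u: uuid.UUID) -> str:
    """Encode all 16 bytes of a UUID as a 22-character URL-safe string."""
    # 16 bytes encode to 24 base64 characters, the last two being "==" padding.
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def decode_uuid(short: str) -> uuid.UUID:
    """Reverse encode_uuid by restoring the stripped padding."""
    return uuid.UUID(bytes=base64.urlsafe_b64decode(short + "=="))


if __name__ == "__main__":
    original = uuid.uuid4()
    short = encode_uuid(original)
    print(original.hex, "->", short)
    # The short form is lossless, so it round-trips to the same UUID.
    assert decode_uuid(short) == original
```

Nothing is thrown away here, so the collision probability stays the same as for the full UUID4; only the text representation gets shorter.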
Upvotes: 5