Gerald Thibault

Reputation: 1093

Will hash(time.time()) always be unique?

I am trying to generate unique ID numbers for some unit tests, and I saw a suggestion somewhere to use something like:

import time

def unique_id():
    time.sleep(0.000001) # smallest precision for time.time()
    return time.time()

I'm wondering if the hash() call will always take at least 0.000001 seconds, so I could use:

import time

def unique_id():
    return hash(time.time())

Could that ever return the same value twice, if I'm calling it in succession within a single threaded application?

edit: bolded the word 'NUMBERS' because every single person has overlooked it.

Upvotes: 4

Views: 6376

Answers (3)

Evan Fosmark

Reputation: 101761

If you need a unique value, it's recommended to use the uuid module. Example:

>>> import uuid
>>> uuid.uuid4()
UUID('514c2bd7-75a3-4541-9075-d66560f42b5c')
>>> str(uuid.uuid4())
'6faad714-c2df-448b-b072-f91deb380e84'

If you need number-only values, use the random module.

>>> import random
>>> import sys
>>> INT_MAX = sys.maxsize #  Set INT_MAX to the max value for your given INT column (sys.maxint on Python 2)
>>> random.randint(0, INT_MAX)
5188925271790705047
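
Since the question asks for ID numbers, it may be worth noting that a UUID object also exposes its value as an integer through its int attribute. A minimal sketch, assuming a 128-bit integer is acceptable as an ID (the helper name unique_number is hypothetical, not part of the uuid module):

```python
import uuid

def unique_number():
    # Hypothetical helper: a random UUID read as a 128-bit integer.
    return uuid.uuid4().int

# Generate a batch of IDs, as a test suite might.
ids = [unique_number() for _ in range(1000)]
print(ids[0])
```

These values will not fit a 32-bit or 64-bit integer column, so this only helps if the ID is not constrained to such a range.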

Upvotes: 9

kindall

Reputation: 184345

This is trivial to answer from the Python prompt:

>>> import time; print(hash(time.time()) == hash(time.time()))
True

(If you see False, you merely got really lucky.)

So, yes. Modern computers are easily fast enough to hash a float in under 0.000001 seconds. In fact, when I wrote that as a while loop that incremented a counter, Python on my machine could get the time and hash it more than 5000 times in a row without seeing a difference. That's not surprising: the hash is used for fitting objects into a hash table (dictionary), so one of its primary requirements is speed.
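
The loop described above might look roughly like the following. This is a sketch rather than the exact code used for the figure quoted; the function name count_repeated_hashes and the limit parameter are made up for illustration, and the count you get depends heavily on the clock resolution of your platform (it can be zero on systems with a fine-grained clock).

```python
import time

def count_repeated_hashes(limit=1_000_000):
    """Count how many consecutive hash(time.time()) calls repeat the first value."""
    first = hash(time.time())
    repeats = 0
    # Stop as soon as the hash changes, or at the safety limit.
    while repeats < limit and hash(time.time()) == first:
        repeats += 1
    return repeats

repeats = count_repeated_hashes()
print("identical hashes in a row:", repeats)
```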

In any case, there is no requirement or guarantee that hash() return a unique identifier for each object. Two distinct values of time.time() (or any type) could have the same hash, and there is nothing preventing these two values from being "adjacent" by some definition.
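
To make the collision point concrete, here is a small sketch, assuming CPython 3.2 or later, where numeric hashes are reduced modulo the prime reported in sys.hash_info.modulus. It shows two clearly different floats that share a hash:

```python
import sys

# Prime used for numeric hashes (2**61 - 1 on typical 64-bit builds).
M = sys.hash_info.modulus

a = 1.0
b = float(M + 1)  # a much larger float; M + 1 is a power of two, so it is exact

print(a == b)              # False: the values differ
print(hash(a) == hash(b))  # True: both hash to 1
```

These two values are far apart, but nothing in the hash contract rules out collisions between values that are close together either.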

What you want, as others have pointed out, is a UUID. Don't reinvent the wheel. If you can't use a UUID, use something that can't ever be duplicated, such as a counter.
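
A counter-based version could be as simple as the sketch below, using itertools.count from the standard library. The starting value of 1 is an arbitrary choice, and the IDs are only unique within a single run of a single process, which matches the single-threaded test scenario in the question.

```python
import itertools

# Module-level counter; each next() call yields the following integer.
_counter = itertools.count(1)

def unique_id():
    # Returns 1, 2, 3, ... and never repeats within this process.
    return next(_counter)

first, second, third = unique_id(), unique_id(), unique_id()
print(first, second, third)  # 1 2 3
```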

Upvotes: 2

Dragontamer5788

Reputation: 1987

Evan Fosmark already covered it.

But I want to add that Python's hash() function only returns a 32-bit or 64-bit value, as far as I can tell. I don't know how it's implemented, but I doubt it is cryptographically strong. Collisions are to be expected from such a low-quality hash function.

Upvotes: 1
