Reputation: 412
I am currently going through learn python the hard way, and i find my self stuck on this dictionary exercise: http://learnpythonthehardway.org/book/ex39.html
What gives me problems is the following code:
def hash_key(aMap, key):
"""Given a key this will create a number and then convert it to
an index for the aMap's buckets."""
return hash(key) % len(aMap)
How can i be sure that i wont get duplicate values from the hash_key function? Since modulo is being used, what prevents hash() from returning values such that after modulo is used on them they return the same hash_key.
Ex. len(aMap)=10
, hash(key1) = 20
, hash(key2) = 30
, therefore hash_key
for both dict keys is 0, even though they are obviously not equal.
I'm having trouble grasping the concepts behind hashing, so if you have any reading material suitable for my skill level, please share. I'm not afraid to work hard.
Thank you for your help.
Upvotes: 1
Views: 1171
Reputation: 5210
The hashmap, as proposed in the linked exercise, intends to generate key-collisions.
The data structure is a list of lists, where the key's hash-modulus value determines the 2nd-level list where your data goes.
Imagine the structure as an array of n
buckets, each with its own key
. If you put something into this datastructure, the hash_key()
method finds the appropriate bucket and appends your new data to that bucket's contents. Actually, it is random what bucket will receive your data, but since it's a hash function, it will always be the same bucket for the same key.
Upvotes: 1