Reputation: 1164
The following demonstration was taken from the Algorithms: Design and Analysis video lecture on Coursera provided by Tim Roughgarden (phenomenal explanations btw). From what I understand, this hash function effectively returns an existing index position between 0 to n in the hash table. However, I am confounded by how this unique computation can be reverse engineered for constant time look-ups when a
(below) is a randomized integer.
Let U = IP addresses (of the form (x1, x2, x3, x4)
with each subscript of x in {0,1,2,...255})
Let n = a prime number
Define one hash function per 4-tuple a = (a1, a2, a3, a4)
with each subscript of a in {0, 1, 2,...n-1}
Define: h<sub>a : IP addresses ---> buckets *n^4 such functions*
by h<sub>a(x1,x2,x3,x4) = (a1x1 + a2x2 + a3x3 + a4x4)
<sub> = subscript notation
The above equation is then computed with modulo n
to return a position within the hash table that has a 1/n probability chance of recurrence for any different inputs within U
. How do I retrieve that IP address location with the original IP address?
Upvotes: 3
Views: 623
Reputation: 241811
Hashes lose information. So you cannot work backwards from the hash to the original. You store the original at the index given by its hash; you can verify that some value x is in the table by seeing whether the item at H(x) is x.
Of course, that won't work if two objects in the table have the same hash (a "collision"). Different hash-table algorithms have different strategies to deal with collisions, and for the strategy your lecturer is probably explaining, it will be useful to have another independent hash function.
Upvotes: 1