Friendly-Robot
Friendly-Robot

Reputation: 1164

How does the universal hash function that randomly generates an integer perform lookup after insertion?

The following demonstration was taken from the Algorithms: Design and Analysis video lecture on Coursera provided by Tim Roughgarden (phenomenal explanations btw). From what I understand, this hash function effectively returns an existing index position between 0 to n in the hash table. However, I am confounded by how this unique computation can be reverse engineered for constant time look-ups when a (below) is a randomized integer.

Let U = IP addresses (of the form (x1, x2, x3, x4)
                       with each subscript of x in  {0,1,2,...255})
Let n = a prime number
Define one hash function per 4-tuple a = (a1, a2, a3, a4)
                         with each subscript of a in {0, 1, 2,...n-1}
Define: h<sub>a : IP addresses ---> buckets        *n^4 such functions*
by h<sub>a(x1,x2,x3,x4) = (a1x1 + a2x2 + a3x3 + a4x4)

<sub> = subscript notation

The above equation is then computed with modulo n to return a position within the hash table that has a 1/n probability chance of recurrence for any different inputs within U. How do I retrieve that IP address location with the original IP address?

Upvotes: 3

Views: 623

Answers (1)

rici
rici

Reputation: 241811

Hashes lose information. So you cannot work backwards from the hash to the original. You store the original at the index given by its hash; you can verify that some value x is in the table by seeing whether the item at H(x) is x.

Of course, that won't work if two objects in the table have the same hash (a "collision"). Different hash-table algorithms have different strategies to deal with collisions, and for the strategy your lecturer is probably explaining, it will be useful to have another independent hash function.

Upvotes: 1

Related Questions