Maxim Tkachenko

Reputation: 5798

How does hashtable read correct values in case of collision?

I have a hashtable. For instance, I have two entities like

john = { 1stname: john, 2ndname: johnson },
eric = { 1stname: eric, 2ndname: ericson }

Then I put them in the hashtable:

ht["john"] = john;
ht["eric"] = eric;

Let's imagine there is a collision and the hashtable uses chaining to resolve it. As a result, there should be a linked list holding these two entities.

How does the hashtable know which entity should be returned for a key? The hash values are the same, and it knows nothing about the structure of the entities. For instance, if I write

var val = ht["john"];

how does the hashtable (having only the key and its hash) find out that the value should be the john record and not eric?

Upvotes: 15

Views: 2905

Answers (2)

I think what you are confused about is what is stored in each node of the hashtable's linked lists. It seems like you assume that only the value is being stored. In fact, the data in each list node is a tuple (key, value).

Once you ask for ht['john'], the hashtable finds the list associated with hash('john') and, if the list is not empty, searches it for the key 'john'. If the key is found as the first element of a tuple, then the value (the second element of that tuple) is returned. If the key is not found, the element is not in the hashtable.

To summarize, the key hash is used to quickly identify the cell in which the element should be stored if present. Actual key equality is tested for to decide whether the key exists or not.
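To make that concrete, here is a minimal sketch of chaining in Python. The class name `ChainedHashTable`, the `hash_fn` parameter and the field names are made up for illustration, and the constant hash function is only there to force the collision from the question; a real hashtable implementation will differ in the details.

```python
class ChainedHashTable:
    """Toy hash table using chaining; each bucket is a list of (key, value) tuples."""

    def __init__(self, num_buckets=8, hash_fn=hash):
        self._buckets = [[] for _ in range(num_buckets)]
        self._hash_fn = hash_fn  # hypothetical hook so a collision can be forced

    def _bucket(self, key):
        # The hash only selects which bucket (chain) to look in.
        return self._buckets[self._hash_fn(key) % len(self._buckets)]

    def __setitem__(self, key, value):
        bucket = self._bucket(key)
        for i, (stored_key, _) in enumerate(bucket):
            if stored_key == key:          # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))        # new key: append to the chain

    def __getitem__(self, key):
        # Walk the chain and compare the stored keys for equality.
        for stored_key, value in self._bucket(key):
            if stored_key == key:
                return value
        raise KeyError(key)


# Force a collision: every key lands in bucket 0.
ht = ChainedHashTable(hash_fn=lambda key: 0)
ht["john"] = {"firstname": "john", "lastname": "johnson"}
ht["eric"] = {"firstname": "eric", "lastname": "ericson"}

val = ht["john"]  # the john record, found by comparing keys inside the chain
```

Both records end up in the same chain, and the lookup still returns the right one because the key is stored next to the value.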

Upvotes: 25

a-man

Reputation: 707

Is this what you are asking about? I already posted this in the comments, but it seems you did not follow the link.

Collision Resolution in the Hashtable Class

Recall that when inserting an item into or retrieving an item from a hash table, a collision can occur. When inserting an item, an open slot must be found. When retrieving an item, the actual item must be found if it is not in the expected location. Earlier we briefly examined two collision resolution strategies:

  • Linear probing
  • Quadratic probing

The Hashtable class uses a different technique referred to as rehashing. (Some sources refer to rehashing as double hashing.)

Rehashing works as follows: there is a set of different hash functions, H1 ... Hn, and when inserting or retrieving an item from the hash table, initially the H1 hash function is used. If this leads to a collision, H2 is tried instead, and onwards up to Hn if needed. The previous section showed only one hash function, which is the initial hash function (H1). The other hash functions are very similar to this function, differing only by a multiplicative factor. In general, the hash function Hk is defined as:

Hk(key) = [GetHash(key) + k * (1 + (((GetHash(key) >> 5) + 1) % (hashsize - 1)))] % hashsize
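As a rough illustration, the formula can be turned into a small open-addressing table in Python. This is a sketch under assumptions, not the actual Hashtable source: `probe_slot`, `insert`, `lookup` and `get_hash` are hypothetical names, k is assumed to start at 0 so that the first probe is just GetHash(key) % hashsize, and deletions are not handled. Note that each slot stores the key together with the value, which is what lets a lookup recognise the right entry.

```python
def probe_slot(key_hash, k, hashsize):
    """Slot visited on the k-th probe, following the Hk formula quoted above."""
    increment = 1 + (((key_hash >> 5) + 1) % (hashsize - 1))
    return (key_hash + k * increment) % hashsize


def insert(table, key, value, get_hash):
    """Store (key, value) in the first free slot of the key's probe sequence."""
    h = get_hash(key)
    for k in range(len(table)):
        slot = probe_slot(h, k, len(table))
        if table[slot] is None or table[slot][0] == key:
            table[slot] = (key, value)
            return slot
    raise RuntimeError("table is full")


def lookup(table, key, get_hash):
    """Follow the same probe sequence, comparing stored keys until a match."""
    h = get_hash(key)
    for k in range(len(table)):
        entry = table[probe_slot(h, k, len(table))]
        if entry is None:        # empty slot: the key was never inserted
            break
        if entry[0] == key:
            return entry[1]
    raise KeyError(key)


# Assumed toy hash that maps every key to 42, to force a collision.
def same_hash(key):
    return 42

table = [None] * 11              # prime table size
insert(table, "john", "john record", same_hash)   # lands in slot 42 % 11 == 9
insert(table, "eric", "eric record", same_hash)   # collides, moves on to slot 1
```

Looking up "eric" probes slot 9 first, sees that the stored key is "john", and continues to slot 1, where the keys match.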

Mathematical Note: With rehashing it is important that each slot in the hash table is visited exactly once when hashsize number of probes are made. That is, for a given key you don't want Hi and Hj to hash to the same slot in the hash table. With the rehashing formula used by the Hashtable class, this property is maintained if the result of (1 + (((GetHash(key) >> 5) + 1) % (hashsize - 1))) and hashsize are relatively prime. (Two numbers are relatively prime if they share no common factors other than 1.) These two numbers are guaranteed to be relatively prime if hashsize is a prime number. Rehashing provides better collision avoidance than either linear or quadratic probing.
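The relatively-prime condition is easy to check by brute force. The snippet below is only a sketch: the helper names are made up, and the table sizes 11 and 12 and the key hash 32 are arbitrary values picked for illustration.

```python
from math import gcd


def increment(key_hash, hashsize):
    """Step between consecutive probes in the rehashing formula."""
    return 1 + (((key_hash >> 5) + 1) % (hashsize - 1))


def probe_sequence(key_hash, hashsize):
    """Slots visited when hashsize probes are made for one key hash."""
    step = increment(key_hash, hashsize)
    return [(key_hash + k * step) % hashsize for k in range(hashsize)]


def visits_every_slot(key_hash, hashsize):
    return len(set(probe_sequence(key_hash, hashsize))) == hashsize


# Prime table size: the step is between 1 and 10, so it is always coprime to 11.
prime_ok = all(visits_every_slot(h, 11) for h in range(1000))

# Composite table size: for key hash 32 the step is 3, and gcd(3, 12) == 3,
# so the probes cycle through only 12 / 3 == 4 distinct slots.
shared_factor = gcd(increment(32, 12), 12)
distinct_slots = sorted(set(probe_sequence(32, 12)))   # [2, 5, 8, 11]
```

With a composite size, some keys can never reach most of the table, which is why a prime hashsize matters here.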

sources here

Upvotes: 2
