Reputation: 35
I understand that retrieval of hash keys is considered O(1), The key in turn can point us to the actual data.
What I fail to understand is that for any number of possible hash values, how do we generally store the hash values. It looks to me that this should itself be a data structure allowing sparse values supporting key value pair, e.g a std::map and can not be accomplished with std:vector.
What I am trying to say here is that if you have a hasCode of 32 bit , you cant possibly keep an array of that size from the beginning when nearly all rows point to NULL as there may not be enough data.
Upvotes: 1
Views: 1120
Reputation: 727087
Hash keys are usually stored in an array or some other structure that supports O(1) random access to its element. The size of the structure increases as you add more elements to the hash table. When this happens, each key-value pair usually gets re-hashed.
In order to store hash keys from a wide range in an array with a relatively narrow range of hash buckets, modulo operator %
and a collision resolution strategy is used. In order to reduce the number of collisions, the number of buckets is set to a prime number. This reduces the possibility that hashCode % bucketCount
steers hash codes to buckets unevenly.
Upvotes: 3
Reputation: 95639
A hash table structure is an array structure, but no, it does not have that many entries; instead, one uses the modulus operation with the size of the array to determine the placement. This modulus operation (aside from the limited size of the hash and the fact that multiple values could produce identical hashes) is one of the reasons that collisions must be handled. Different types of hash maps handle collisions differently; popular solutions include having a linked list at each entry to which one appends (the "chained" method) or performing a secondary hash and walking the array to find an open slot (the "open addressing" method); other strategies exist as indicated in dasblinkenlight's link.
Note that std::map
is NOT a hash map; it is a tree-based map structure, and lookup is O(log n). However, std::unordered_map
is a hash map structure for which lookup is O(1).
Upvotes: 2