Reputation: 328
The size of a HashMap is increased based on the load factor, but how is it decided that a HashMap has become full, given that each bucket can contain a huge number of entries? Why does it create new buckets instead of adding entries to the existing buckets? And is there any way to determine the number of entries in a bucket? How does the load factor come into play? Say the initial capacity is 16 and the load factor is 0.75; 16 * 0.75 is 12, so will the map be rehashed after we store 12 entries, or after 12 buckets have entries and the rest of the buckets are empty? What exactly does this 12 represent?
Upvotes: 0
Views: 408
Reputation: 88
how is it decided that a hashmap has become full?
Theoretically, it can never be said to be full: we could keep adding any number of entries to a bucket's linked list (as you said).
Assume many (key, value) entries are being added to a HashMap. Ideally, the HashMap operations (get, put, contains) should run in O(1). To achieve that, each bucket should store only one (key, value) pair. Whenever the hash map gets a collision, it could reorganize its underlying data structures to restore ideal hashing, but reorganizing the internal data structures on every collision is an expensive operation and would degrade the map's performance.
So instead, some collisions are tolerated. A hash map is rehashed when the number of elements in the map reaches a maximum threshold value. At rehashing time, the number of underlying buckets is doubled and the (key, value) pairs are remapped to the new set of buckets.
By doing so, the number of (key, value) pairs per bucket generally comes down. Thus, by using extra space, the hash map performs better.
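To make the threshold concrete, here is a minimal, hypothetical sketch of the idea (the class SimpleHashMap and all details are simplified illustrations, not the real java.util.HashMap source). With capacity 16 and load factor 0.75, the threshold is 12, so the 13th entry triggers a rehash regardless of how the first 12 entries are spread across buckets:

```java
import java.util.LinkedList;

// Simplified, hypothetical hash map illustrating the threshold/rehash idea.
// Null keys and duplicate-key replacement are omitted for brevity.
class SimpleHashMap<K, V> {
    static class Entry<K, V> {
        final K key; V value;
        Entry(K key, V value) { this.key = key; this.value = value; }
    }

    LinkedList<Entry<K, V>>[] buckets;
    final float loadFactor = 0.75f;
    int size = 0;      // total number of entries in the whole map
    int threshold;     // capacity * loadFactor

    @SuppressWarnings("unchecked")
    SimpleHashMap() {
        buckets = new LinkedList[16];                     // initial capacity 16
        threshold = (int) (buckets.length * loadFactor);  // 16 * 0.75 = 12
    }

    void put(K key, V value) {
        if (size + 1 > threshold) resize();  // rehash when the 13th entry arrives
        int index = indexFor(key.hashCode(), buckets.length);
        if (buckets[index] == null) buckets[index] = new LinkedList<>();
        buckets[index].add(new Entry<>(key, value));
        size++;
    }

    @SuppressWarnings("unchecked")
    void resize() {
        LinkedList<Entry<K, V>>[] old = buckets;
        buckets = new LinkedList[old.length * 2];         // double the buckets
        threshold = (int) (buckets.length * loadFactor);
        for (LinkedList<Entry<K, V>> bucket : old) {      // remap every entry
            if (bucket == null) continue;
            for (Entry<K, V> e : bucket) {
                int index = indexFor(e.key.hashCode(), buckets.length);
                if (buckets[index] == null) buckets[index] = new LinkedList<>();
                buckets[index].add(e);
            }
        }
    }

    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);  // works because capacity is a power of two
    }
}
```

Note that the resize decision looks only at the global size counter, never at how full any individual bucket is.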
is there any way to determine the number of entries in a bucket?
In a HashMap, we cannot know the number of entries in each bucket. Even if we knew it, we could not split just that one bucket. For example, suppose there are currently 16 buckets and we know that many (key, value) pairs are mapped to one of them. We cannot simply split that bucket into two, nor can we explicitly create a new bucket to share its load: in both cases the bucket count would become 17, but a HashMap must always have a power-of-two number of buckets. So knowing the number of entries in one bucket would not help: HashMap makes no bucket-level decisions. The decision to resize is made globally, based on the total number of entries in the map.
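As a quick, self-contained demo of why the power-of-two constraint matters: the bucket index is computed by masking the hash with capacity - 1, which only spreads keys across all buckets when that mask is all one-bits.

```java
// index = hash & (capacity - 1) only uses every bucket when capacity is
// a power of two, i.e. when (capacity - 1) is all one-bits.
public class PowerOfTwoDemo {
    public static void main(String[] args) {
        int hash = "example".hashCode();
        // capacity 16: mask is 0b01111, index can be any of 0..15
        System.out.println(hash & (16 - 1));
        // capacity 17: mask is 0b10000, index can only be 0 or 16,
        // so 15 of the 17 buckets would never be used
        System.out.println(hash & (17 - 1));
    }
}
```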
Upvotes: 1
Reputation: 972
Because if we kept adding entries to the existing buckets, those entries would be stored as a list within each bucket, and that would increase the running time of the get method: searching a list takes O(n) comparisons.
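Here is a small illustration of that degradation (BadKey is a made-up class whose constant hashCode forces every entry into the same bucket):

```java
import java.util.HashMap;
import java.util.Map;

// If every key hashes to the same bucket, get() degrades to searching
// that one bucket's colliding entries instead of an O(1) lookup.
public class CollisionDemo {
    static class BadKey {
        final int id;
        BadKey(int id) { this.id = id; }
        @Override public int hashCode() { return 42; }  // every key collides
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
    }

    public static void main(String[] args) {
        Map<BadKey, String> map = new HashMap<>();
        for (int i = 0; i < 10_000; i++) {
            map.put(new BadKey(i), "value" + i);
        }
        // All 10,000 entries live in one bucket, so this lookup must
        // search through the colliding keys rather than jump straight
        // to the answer. (Java 8+ mitigates this by treeifying very
        // long buckets, but the lookup is still not O(1).)
        System.out.println(map.get(new BadKey(9_999)));
    }
}
```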
Upvotes: 0
Reputation: 182779
The fullness of a hashmap is judged by the number of entries relative to the number of buckets. Typically, these two values are just stored as data members in the hashmap and updated as they change.
Adding entries to existing buckets makes the hashmap slower, especially when searching for entries that are not present, since you have to look at every entry in the bucket where the desired value would be if it were present.
Imagine if you were storing records indexed by name. If you had only 300 records, indexing them by the first letter of the name might be fine. Worst case, if you're looking for someone whose name starts with "S" you might have to look through 20 records. But if you had 2,000 records, you probably want to index them by more than one letter, that is, using more than just 26 buckets.
In practice, this is exactly what people do when they have larger numbers of records to index. They might, for example, add a bucket by splitting the "S" bucket into "Sa-Sk" and "Sl-Sz".
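A rough numerical sketch of the same effect (the bucket counts 26 and 52 just mirror the letter analogy; real HashMaps use power-of-two counts and real hash codes):

```java
import java.util.Random;

// Spreading the same records over more buckets shortens the longest
// chain a lookup would have to search.
public class BucketSpreadDemo {
    public static void main(String[] args) {
        Random rnd = new Random(1);
        int[] small = new int[26];   // "one bucket per letter"
        int[] large = new int[52];   // "each letter's bucket split in two"
        for (int i = 0; i < 2_000; i++) {
            int hash = rnd.nextInt(Integer.MAX_VALUE);
            small[hash % 26]++;
            large[hash % 52]++;
        }
        System.out.println("max chain with 26 buckets: " + max(small));
        System.out.println("max chain with 52 buckets: " + max(large));
    }

    static int max(int[] counts) {
        int m = 0;
        for (int c : counts) m = Math.max(m, c);
        return m;
    }
}
```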
Upvotes: 0