Shanky

Reputation: 23

Hashmap - Load Factor equal to capacity

I was asked a very nice question in an interview. The question was:

We know that rehashing takes place when the number of HashMap entries reaches the load factor. For example, if my load factor is 7 and my map size is 9, then rehashing will take place when the number of entries reaches 7.

Now suppose I only want to enter 8 elements into this map (and, as I said, I have kept the load factor at 7). After I entered the 7th element, rehashing took place. I just wanted to add one more element after the 7th, but rehashing happened anyway; the size was increased unnecessarily and memory was wasted. So why didn't I keep the load factor equal to the initial capacity? Why is it kept at 75% or 80% or something like that?

Thanks in advance for your answers.

Upvotes: 1

Views: 246

Answers (2)

lulyon

Reputation: 7265

" So why didn't I keep load factor size as equal to initial capacity . Why it is kept to 75% or 80% or something like that ."

The HashMap code cannot predict how many elements you will insert into the map. What it tries to do is keep memory reallocation infrequent. Expanding the table when the load factor is exceeded is a simple (and effective) strategy that keeps the cost low: inserting n elements causes only O(log(n)) reallocations in the worst case.
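
To see why, here is a rough sketch (not the real java.util.HashMap code) that just counts how many times a doubling table resizes while n entries are inserted, using an initial capacity of 16 and a 0.75 load factor as illustrative defaults:

    public class ResizeCountDemo {

        // Count resizes of a table that doubles whenever the number of entries
        // exceeds capacity * loadFactor (a simplified model of HashMap growth).
        static int resizes(int n, int initialCapacity, double loadFactor) {
            int capacity = initialCapacity;
            int count = 0;
            for (int size = 1; size <= n; size++) {
                if (size > capacity * loadFactor) { // threshold exceeded
                    capacity *= 2;                  // grow geometrically
                    count++;
                }
            }
            return count;
        }

        public static void main(String[] args) {
            for (int n : new int[] {100, 1_000, 1_000_000}) {
                System.out.println(n + " inserts -> " + resizes(n, 16, 0.75) + " resizes");
            }
        }
    }

With these numbers the sketch reports 4, 7, and 17 resizes for 100, 1,000, and 1,000,000 inserts, i.e., the number of reallocations grows only logarithmically with the number of insertions.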

Update

The load factor is an indicator of how likely a newly inserted element is to collide with an element already in the map: the fuller the HashMap, the more likely the new element is to collide, meaning it hashes to the same bucket as an existing element, which leads to longer linked lists in the affected buckets and thus reduced efficiency. Keeping the load factor bounded prevents the HashMap from degenerating into a few long linked lists, for which insert and delete operations are no longer O(1).
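
As a rough illustration (a simulation with random hash codes, not the real HashMap internals), the sketch below drops a growing number of keys into a fixed table of 16 buckets and reports the longest chain, showing how chains tend to lengthen as the load factor rises:

    import java.util.Random;

    public class ChainLengthDemo {

        // Hash n random keys into 'buckets' slots and return the longest chain.
        static int longestChain(int n, int buckets, Random rnd) {
            int[] counts = new int[buckets];
            int max = 0;
            for (int i = 0; i < n; i++) {
                int slot = Math.floorMod(rnd.nextInt(), buckets);
                max = Math.max(max, ++counts[slot]);
            }
            return max;
        }

        public static void main(String[] args) {
            Random rnd = new Random(42);
            int buckets = 16;
            for (int n : new int[] {8, 12, 16, 32}) { // load factors 0.5, 0.75, 1.0, 2.0
                System.out.printf("load factor %.2f -> longest chain %d%n",
                        (double) n / buckets, longestChain(n, buckets, rnd));
            }
        }
    }

The exact numbers depend on the random seed, but the trend is that the further the table is filled past its load factor, the longer the worst-case bucket chain, and operations on those buckets are no longer O(1).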

Upvotes: 1

Sean Owen

Reputation: 66891

The load factor is not 7 in this case but 7/9. You're asking why it is less than 1, since that means some memory is wasted on empty buckets. The answer is that it speeds up insertion and deletion by increasing the likelihood that items are evenly distributed across buckets. With a load factor less than 1 and a good hash function you usually have just 1 item per bucket - few or no collisions.
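
If the goal is to hold exactly 8 entries and never resize, you can size the map up front so that the expected number of entries stays at or below capacity * loadFactor. A minimal sketch (keeping in mind that java.util.HashMap rounds the requested capacity up to the next power of two internally):

    import java.util.HashMap;
    import java.util.Map;

    public class PresizeDemo {
        public static void main(String[] args) {
            int expectedEntries = 8;
            float loadFactor = 0.75f;

            // Pick a capacity large enough that expectedEntries never exceeds
            // the resize threshold (capacity * loadFactor).
            int capacity = (int) Math.ceil(expectedEntries / loadFactor); // 11; HashMap rounds it up to 16

            Map<Integer, String> map = new HashMap<>(capacity, loadFactor);
            for (int i = 1; i <= expectedEntries; i++) {
                map.put(i, "value" + i); // stays below the threshold, so no rehashing
            }
            System.out.println("size = " + map.size());
        }
    }

This trades a few empty buckets for the guarantee that no rehash happens, which is exactly the memory-versus-speed trade-off that a load factor below 1 makes by default.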

Upvotes: 1
