Geek

Reputation: 27219

Why would a higher load factor in HashMap increase lookup cost?

From the JavaDoc of HashMap:

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put).

If we use a higher value, why would it increase the lookup cost?

Upvotes: 7

Views: 3462

Answers (5)

Sergey Kalinichenko

Reputation: 727047

A hash table's load factor is defined as

n/s, the ratio of the number of stored entries n to the size s of the table's array of buckets.

A hash table performs well when the number of collisions is low. At a higher load factor, the table keeps the same number of entries in fewer buckets, which increases the probability of collisions.
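For illustration, here is a minimal sketch using HashMap's two-argument constructor (the class name, keys, and the 0.5f/2.0f values are just for the demo):

    import java.util.HashMap;
    import java.util.Map;

    public class LoadFactorDemo {
        public static void main(String[] args) {
            // A higher load factor lets the table fill up more before it
            // resizes, so the same entries share fewer buckets and the
            // probability of collisions rises.
            Map<String, Integer> sparse = new HashMap<>(16, 0.5f); // resizes early, fewer collisions
            Map<String, Integer> dense  = new HashMap<>(16, 2.0f); // resizes late, more collisions

            for (int i = 0; i < 100; i++) {
                sparse.put("key" + i, i);
                dense.put("key" + i, i);
            }

            // Both maps give the same answers; the difference is internal:
            // get() on the denser table scans longer per-bucket chains.
            System.out.println(sparse.get("key42")); // 42
            System.out.println(dense.get("key42"));  // 42
        }
    }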

Upvotes: 6

Vaibhav Desai

Reputation: 2402

The load factor of 0.75 can be interpreted using the formula n/s, the ratio of the number of stored entries n to the size s of the table's array of buckets:

Suppose you have 75 values to store in a hash table and 100 empty array slots to store them in. Here the chance of collision is minimized, and the load factor is 0.75.

Now suppose you have 75 values to store and only 10 empty array slots (load factor 7.5). This means you will have collisions and must employ a collision resolution technique, which will increase your search time.

The other way around, if you have 75 entries and 1000 empty array slots (load factor 0.075), you are left with many empty slots, which wastes a lot of space.

Hence the rule of thumb: as the load factor grows, your search time increases, and as it approaches 0, more and more storage space is wasted.

Hence it is a space-time trade-off.
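A tiny arithmetic sketch of the three scenarios above (the class and method names are hypothetical):

    public class LoadFactorMath {
        // load factor = n / s: stored entries over bucket count
        static double loadFactor(int entries, int buckets) {
            return (double) entries / buckets;
        }

        public static void main(String[] args) {
            System.out.println(loadFactor(75, 100));  // 0.75  -> good balance
            System.out.println(loadFactor(75, 10));   // 7.5   -> heavy collisions, slow search
            System.out.println(loadFactor(75, 1000)); // 0.075 -> mostly empty, wasted space
        }
    }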

Upvotes: 0

Vivek Gupta

Reputation: 2814

Here we should first understand what capacity and load factor mean:

capacity: the number of buckets in the hash table at any given point in time.

load factor: a measure of how full the hash table is allowed to get before its capacity is automatically increased.

So the higher the load factor, the more occupied the hash table can get before its capacity is increased.

  • Given the best possible hashCode() implementation, only one value will go into each bucket, and the lookup cost will be minimal.
  • In the worst case, all values will go into the same bucket, and the lookup cost will be maximal (see the sketch after this list).
  • The average case also depends on the hashCode() implementation, but another factor at play is the load factor: the more occupied the collection is, the higher the chance of collision, and thus a higher load factor increases the lookup cost in any non-ideal scenario.
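To make the worst case concrete, here is a sketch with a hypothetical key class whose hashCode() is constant, forcing every entry into one bucket:

    import java.util.HashMap;
    import java.util.Map;

    public class WorstCaseDemo {
        // Hypothetical key: every instance hashes to the same bucket.
        static final class BadKey {
            final int id;
            BadKey(int id) { this.id = id; }
            @Override public boolean equals(Object o) {
                return o instanceof BadKey && ((BadKey) o).id == id;
            }
            @Override public int hashCode() { return 42; } // constant -> all keys collide
        }

        public static void main(String[] args) {
            Map<BadKey, Integer> map = new HashMap<>();
            for (int i = 0; i < 10_000; i++) {
                map.put(new BadKey(i), i);
            }
            // All 10,000 entries share one bucket, so this lookup must
            // wade through many keys instead of roughly one.
            System.out.println(map.get(new BadKey(9_999))); // 9999
        }
    }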

Upvotes: 3

Mohammod Hossain

Reputation: 4114

The default load factor is 0.75.

If you declare the load factor as 1, the restructuring (rehashing) process only happens when the number of elements exceeds the capacity of the HashMap.
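A small sketch of the arithmetic behind that statement (assuming the usual rule that a HashMap resizes once its size exceeds capacity × load factor):

    public class ResizeThreshold {
        public static void main(String[] args) {
            int capacity = 16; // HashMap's default initial capacity

            // The resize threshold is capacity * loadFactor.
            System.out.println(capacity * 0.75f); // 12.0 -> default: resize when a 13th entry arrives
            System.out.println(capacity * 1.0f);  // 16.0 -> load factor 1: resize only once completely full
        }
    }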

Upvotes: 0

aleroot

Reputation: 72676

It has to do with how a hash table is implemented under the hood: it uses hash codes, and since the algorithm for computing hash codes is not perfect, you can have collisions. Increasing the load factor increases the probability of collisions and consequently reduces lookup performance.
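For example, two distinct strings can already collide on hashCode() alone; these are real Java values that are easy to verify:

    public class CollisionDemo {
        public static void main(String[] args) {
            // Two different strings with identical hash codes in Java:
            System.out.println("Ea".hashCode()); // 2236
            System.out.println("FB".hashCode()); // 2236
            // They land in the same bucket, and equals() must break the tie.
            // A fuller table (higher load factor) also makes keys with
            // *different* hash codes share buckets more often, since the
            // bucket index is effectively hash mod table size.
        }
    }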

Upvotes: 2
