Nazmul Hoque Shafin

Reputation: 369

How does a single instance in each bucket yield the best performance in a Java HashMap?

I have read in a book that if a hash function returns a unique hash value for each distinct object, it is most efficient. If the hashCode() method in a class gives a unique hash value for each distinct object and I want to store n distinct instances of that class in a HashMap, then there will be n buckets for storing the n instances. The time complexity will be O(n). How, then, does a single entry (instance) for each hash value yield better performance? Is it related to the data structure of the bucket?

Upvotes: 3

Views: 137

Answers (2)

JimR21

Reputation: 21

A Java HashMap associates a Key k with a Value v. Every Java Object has a method hashCode() that produces an integer which is not necessarily unique.
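For example, two distinct String objects can share a hashCode. A minimal sketch (the class name HashCodeCollision is just for the demo):

    public class HashCodeCollision {
        public static void main(String[] args) {
            // Two distinct String objects that happen to share the same hashCode.
            String a = "Aa";
            String b = "BB";
            System.out.println(a.hashCode()); // 2112
            System.out.println(b.hashCode()); // 2112
            System.out.println(a.equals(b));  // false -- distinct keys, same hash value
        }
    }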

I have read in a book that if a hash function returns a unique hash value for each distinct object, it is most efficient.

Another way to put it is that the best hash function is the one that produces the fewest collisions.

If the hashCode() method in a class gives a unique hash value for each distinct object and I want to store n distinct instances of that class in a HashMap, then there will be n buckets for storing the n instances. The time complexity will be O(n).

In our case, the HashMap contains a table of buckets of a certain size, let's say >= n for our purposes. It takes the hashCode of the key object and, through a hash function, turns it into an index into that table. If we have n objects and the hash function returns n different indexes, we have zero collisions. This is the optimal case, and the complexity to find and get any object is O(1).
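To make the index step concrete, here is a small sketch of how the current OpenJDK HashMap derives a bucket index from a key's hashCode (the table length is always a power of two; the helper name bucketIndex is mine):

    // Sketch: turn a key's hashCode into a bucket index, mirroring OpenJDK's HashMap.
    static int bucketIndex(Object key, int tableLength) {
        int h = key.hashCode();
        int hash = h ^ (h >>> 16);       // spread the high bits into the low bits
        return (tableLength - 1) & hash; // same as hash % tableLength when tableLength is a power of two
    }

With 16 buckets, bucketIndex("Aa", 16) and bucketIndex("BB", 16) land on the same slot, since the two keys already share a hashCode.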

Now, if the hash function returns the same index for two different keys (objects), we have a collision: the table bucket at that index already contains a value. In that case the existing entry is linked to a newly allocated one, and in this way a list grows at the index where the collision happened. So the worst-case complexity will be O(m), where m is the size of the longest list.
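A toy sketch of that chaining idea (this ToyChainedMap is hypothetical and much simpler than the real HashMap, which also caches hashes, resizes, and can convert long chains to trees):

    import java.util.LinkedList;

    // Hypothetical separate-chaining table, for illustration only.
    class ToyChainedMap<K, V> {
        private static class Entry<K, V> {
            final K key;
            V value;
            Entry(K key, V value) { this.key = key; this.value = value; }
        }

        @SuppressWarnings("unchecked")
        private final LinkedList<Entry<K, V>>[] table = new LinkedList[16];

        private int indexFor(K key) {
            return (table.length - 1) & key.hashCode();
        }

        public void put(K key, V value) {
            int i = indexFor(key);
            if (table[i] == null) table[i] = new LinkedList<>();
            for (Entry<K, V> e : table[i]) {            // key already present: overwrite
                if (e.key.equals(key)) { e.value = value; return; }
            }
            table[i].add(new Entry<>(key, value));      // empty bucket or collision: append to the chain
        }

        public V get(K key) {
            int i = indexFor(key);
            if (table[i] == null) return null;
            for (Entry<K, V> e : table[i]) {            // worst case walks the whole chain: O(m)
                if (e.key.equals(key)) return e.value;
            }
            return null;
        }
    }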

In conclusion the performance of the HashMap depends on the number of collisions. The fewer the better.

I believe this video will help you.

Upvotes: 2

Eugene

Reputation: 120848

You seem to think that with n buckets for n elements, the time complexity will be O(n); that is wrong.

How about a different example: suppose you have an ArrayList with n elements. How much time will it take to perform a get(index)? O(1), right?

Now think about the HashMap: the index in the ArrayList example plays the role of the hash code in the map. When we insert into a HashMap, we use the hashCode (the index) to find the location where that element goes (the bucket). If there is one entry per bucket, the search time for a value in the map is O(1).

But even if there are multiple values in a single bucket, the expected search complexity for a HashMap is still O(1).
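In everyday use that looks like this (class name is just for the demo); each get recomputes the key's hash, jumps straight to one bucket, and only compares the few entries stored there:

    import java.util.HashMap;
    import java.util.Map;

    public class HashMapLookup {
        public static void main(String[] args) {
            Map<String, Integer> ages = new HashMap<>();
            ages.put("alice", 30);  // bucket index derived from "alice".hashCode()
            ages.put("bob", 25);    // different hashCode, (very likely) a different bucket
            System.out.println(ages.get("bob")); // 25, found in O(1) on average
        }
    }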

The data structure of the bucket is important too, especially for the worst-case scenarios. The current implementation of HashMap uses two node types, a plain linked Node and a TreeNode, choosing between them based on things like how many entries the bucket holds at that point in time. The linked form is easy:

next.next.next...

A TreeNode, on the other hand, has left and right children:

        - left
    node
        - right

It is a red-black tree. In such a data structure the search complexity is O(log n), which is much better than O(n).
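To see why the tree bins matter, here is a sketch with a deliberately bad key whose hashCode is constant, so every entry collides into the same bucket (BadKey is a made-up class; in the current OpenJDK implementation a bucket is converted to a red-black tree once it grows past a small threshold, provided the table is large enough):

    import java.util.HashMap;
    import java.util.Map;

    // Deliberately bad key: every instance reports the same hashCode,
    // so all entries land in the same bucket.
    class BadKey implements Comparable<BadKey> {
        final int id;
        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; }  // constant -> every key collides
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
        @Override public int compareTo(BadKey other) {  // lets the tree bin order the keys
            return Integer.compare(id, other.id);
        }
    }

    public class WorstCaseDemo {
        public static void main(String[] args) {
            Map<BadKey, Integer> map = new HashMap<>();
            for (int i = 0; i < 1000; i++) {
                map.put(new BadKey(i), i);              // all 1000 entries share one bucket
            }
            // A plain linked list would scan up to 1000 nodes here (O(n));
            // the tree bin finds the key in O(log n).
            System.out.println(map.get(new BadKey(999))); // 999
        }
    }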

Upvotes: 2
