mcw
mcw

Reputation: 50

Inevitable Collisions When Hashing?

If I create a new Map:

Map<Integer,String> map = new HashMap<Integer,String>();

Then I call map.put() a bunch of times each with a unique key, say, a million times, will there ever be a collision or does java's hashing algorithm guarantee no collisions if the key is unique?

Upvotes: 3

Views: 383

Answers (3)

Adrian Shum
Adrian Shum

Reputation: 40036

There are two things you need to know:

  1. Even there is collision, it is not going to cause problem, because for each bucket, there is a list. In case you are putting to a bucket that already have value inside, it will simply append at the list. When retrieval, it will first find out which bucket to lookup, and from the bucket, go through each value in the list and find out the one that is equals (by calling equals())

  2. If you are putting millions of value in the Hashmap, you may wonder, then every linked list in the map will contains thousands of values. Then we are always doing big linear search which will be slow. Then you need to know that, Java's HashMap is going to be resized whenever number of entries are larger than certain threshold (have a look in capacity and loadFactor in Javadoc). With a properly implemented hash code, number of entries in each bucket is going to be small.

Upvotes: 1

Denise
Denise

Reputation: 2005

Hashing does not guarantee that there will be no collisions if the key is unique. In fact, the only thing that's required is that objects that are equal have the same hashcode. The number of collisions determines how efficient retrieval will be (fewer collisions, closer to O(1), more collisions, closer to O(n)).

What an object's hashcode will be depends on what type it is. For instance, a string's default hashcode is

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]

which necessarily simplifies down the complexity of the string to a single number -- definitely possible to reach the same hashcode with two different strings, though it'll be pretty rare.

If two things hash to the same thing, hashmap uses .equals to determine whether a particular key matches. That's why it's so important that you override both hashCode() and equals() together and ensure that things that are equal have the same hash code.

Upvotes: 3

Flying_Banana
Flying_Banana

Reputation: 2910

Hashtable works somewhat as follows:

  1. A hashmap is created with an initial capacity (or number of buckets)

  2. Each time you add an object to it, java invokes the hash function of the key, a number, then modulo this to the current size of the hashtable

  3. The object is stored in the bucket with the result from step 2.

So even if you have unique keys, they can still collide unless you have as many buckets as your range of hash of your key.

Upvotes: 1

Related Questions