user96454
user96454

Reputation: 133

Java default string hash function produces collisions on single character strings

I am a CS student, so please bear with me if what I say sounds too ridiculous. It definitely does so to me, that is why I am here in search of an answer. I read how strings are hashed in Java, and then I took a look at the ASCII table. The letters "d" and "n" hash to 100 and 110 respectively. Now if I were to create a brand new hashmap in Java, by default it has 10 buckets. So even though the hashcodes are unique, mod 10 they are both 0. This leads to a collision. Having collision on 1 character strings just doesn't sit well with me, so is the process I described correct? Thanks in advance.

Upvotes: 2

Views: 403

Answers (2)

rtomchinsky
rtomchinsky

Reputation: 61

What you described is probably correct, both would fall on the same bucket due to the pigeonhole principle, which basically means that if you have more items than holes to put them on, two or more will end up on the same hole. In this case, considering only the 95 printable ASCII characters, the principle states that there would be at least 10 in each hole (not considering the actual values, only the amount of them).

However, shazin's answer is also correct in that the hash values are not actually used as the identity for the values in a map, instead they are used to find the bucket in which the kay/value pair belongs, and then values in the bucket are checked for equality with their equals() method (or with ==, if using IdentityHashMap.)

Upvotes: 5

shazin
shazin

Reputation: 21923

Hash is actually used in Hash based collections as an index or grouping mechanism. Hash is not used as an actual reference. Hash used to find the bucket in which the element may contain first.

As you said d and n can be contained in the same bucket but after that the actual value in this case d and n can be used to refer the actual object. Since Hashmaps don't allow duplicate keys you can be sure that there will always be one d and one n.

Upvotes: 3

Related Questions