Reputation: 19562
It seems that Java's implementation of the HashMap always places the keys to the same bins (at least I saw that with Integer keys). I.e. the hashing is deterministic and in all runs it produces the same value.
I have heard that some languages randomize the insertions so that in which bucket a key will be stored is unpredictable for security reasons.
Why is Java the keys are always the same?
Upvotes: 3
Views: 1015
Reputation: 15273
To complete other answers, I would mention that when HashMap
's keys use built-in Object.hashCode
, i. e. don't override this method, that is a pretty often case, Object's hashCode is computed using random number generator, that makes whole system behaivour less deterministic. See this question: Java Object.hashCode() - address or random()? for more details.
Upvotes: 0
Reputation: 33936
This is not true for Java 7, which added a unique hash seed to each HashMap instance. There's more information on the Collections Framework Enhancements in Java SE 7 page.
This mechanism was removed in Java 8 for performance, and it was replaced by an alternative that converts comparable keys (such as String) into a balanced tree to elide the DoS security problem. There's more information on the Collections Framework Enhancements in Java SE 8 page.
Upvotes: 5
Reputation: 147154
The attack of interest here is Denial of Service (DoS). An adversary chooses a set of keys that hit the same bucket. This transforms the performance of map operations from O(1) to O(n). Do this n times (to construct the map say), and we go from O(n) to O(n^2). There's also the possibility of timing attacks, but I'll conveniently ignore that.
In general, most library code will assume no action is necessary to avoid DoS. However, recently some Java implementations have used MURMUR hashing to randomise the hash function for String
to avoid certain attacks. MURMUR mixes a per-process random number into the generation of the hash code such that the function is stable for the process, but is difficult (though not necessarily impossible) to figure out from the outside. More recently this has been replaced with falling back to a tree structure if there are excessive collisions and the key implements Comparable
appropriately.
If you're worried about such attacks in the situation you find your code, you can use other Map
implementations such as java.util.TreeMap
.
Upvotes: 5
Reputation: 156524
In Java, the individual class is responsible for implementing a default hashCode() method (or inheriting one from the Object
class), and complicated security scenarios like defeating advanced DoS attacks are not the responsibility of classes like Object
, Integer
, etc.
So most classes use a very simple, fast implementation that tries to ensure fairly even distribution in common cases.
If you have a case where you feel it's important to implement a custom hashing strategy, whether it's because you want to avoid hacking, or because you know your particular usage is likely to cause a lot of collisions with the default method, you can use a collection like Gnu Trove's THashMap
, which allows you to provide a custom hashing strategy specific to the collection instance.
Upvotes: 1