jpountz
jpountz

Reputation: 9964

hashCode implementation strategies

Some major JVM classes (such as String or List implementations) implement equals by returning Σ 31^n * field_n.hashCode() for every field_n that is relevant for the equals method. Beside, this approach is recommended by Joshua Bloch in Effective Java (item 9).

However, other classes such as Map.Entry implementations follow different rules. For example, the Map.Entry documentation states that the hash code of a Map.Entry should be

 (e.getKey()==null   ? 0 : e.getKey().hashCode()) ^
 (e.getValue()==null ? 0 : e.getValue().hashCode())

This can sometimes be impractical to use in hash tables, since:

Why did Java choose this implementation specification for Map.Entry hashCode instead of, for example, 31 * (e.getKey()==null ? 0 : e.getKey().hashCode()) + (e.getValue()==null ? 0 : e.getValue().hashCode())?

Edit 1:

To help figure out the problem, here is an example of useful code where the result has very poor performance due to hash collisions if many entries have the same key and value.

This method computes the frequencies of the entries of different maps (using Guava's Multiset).

public static <K, V> Multiset<Map.Entry<K, V>> computeEntryCounts(
        Iterable<Map<K, V>> maps) {
    ImmutableMultiset.Builder<Map.Entry<K, V>> result = ImmutableMultiset.builder();
    for (Map<K, V> map : maps) {
        for (Map.Entry<K, V> entry : map.entrySet()) {
            result.add(entry);
        }
    }
    return result.build();
}

Upvotes: 5

Views: 1106

Answers (2)

ruakh
ruakh

Reputation: 183201

I doubt there's a good reason — I think it's just an oversight — but it's not a serious problem. It would only come up if you had a HashSet<Map.Entry<T,T>> or a HashMap<Map.Entry<T,T>,V>, which is not commonly done. (Edited to add: Or, as Joachim Sauer points out below, a HashSet<Map<T,T>> or a HashMap<Map<T,T>,V> — also not commonly done.)

Note that HashMap<K,V> does not use Map.Entry<K,V>.hashCode(), because it looks up entries by their keys only, so it just uses K.hashCode().

Upvotes: 4

stefan bachert
stefan bachert

Reputation: 9608

My personal guess is, hashCode should be fast too.

Since you are allowed to override hashCode there is no problem with it. Change it when you know a better fitting algorithm for your case

Upvotes: 0

Related Questions