Runpeng Chen
Runpeng Chen

Reputation: 105

Why does the Java HashMap implementation use transfer() instead of put() when resizing?

In the Java 7 HashMap implementation, in the resize method, it calls transfer which moves old elements to a new table. Why have they written a new method instead of calling put with all the old elements? The resize won't be triggered again due to threshold. Calling put makes the code clearer.

/**
 * Transfers all entries from current table to newTable.
 */
void transfer(Entry[] newTable) {
    Entry[] src = table;
    int newCapacity = newTable.length;
    for (int j = 0; j < src.length; j++) {
        Entry<K,V> e = src[j];
        if (e != null) {
            src[j] = null;
            do {
                Entry<K,V> next = e.next;
                int i = indexFor(e.hash, newCapacity);
                e.next = newTable[i];
                newTable[i] = e;
                e = next;
            } while (e != null);
        }
    }
}

Upvotes: 4

Views: 104

Answers (1)

Joachim Sauer
Joachim Sauer

Reputation: 308081

One important difference is that transfer can make use of the fact that Entry objects already exist for each entry in two ways:

  • it can reuse the Entry objects themselves to avoid having to allocate new ones (thus avoiding memory allocation and therefore reducing GC pressure).
  • it can reuse the hash value stored in the Entry object thus avoiding having to call Object.hashValue on each key that's already in the map (which could theoretically be an expensive operation).

Basically: if resize was implemented simply in terms of put it would have to re-do a lot of work that can easily be avoided.

Later versions of the JDK have significantly more complex HashMap implementations, where even more complex transfer methods (or equivalent) are implemented.

It's also worth pointing out that even minor performance gains at the cost of less "simple" code are often worth it when done in the JDK itself: since HashMap is used in basically every Java program out there, having it be as performant as possible at the cost of slightly less readable cost is usually worthwhile trade-off. That same reasoning does not apply equally to most other software that us "mere mortals" write.

Upvotes: 5

Related Questions