Martin Geisse

Reputation: 1451

Why does String.intern() seem to return different instances for equal strings?

I have observed a behavior of String.intern() that I am trying to understand. It seems to contradict the documentation for that method.

private static String buildSampleString() {
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < 10; i++) {
        builder.append((char)(i + 'a'));
    }
    return builder.toString();
}

private static void performTest(String a) {
    String b = buildSampleString().intern();
    System.out.println("a vs. b: " + (a == b) + ", " + a.equals(b));
    System.out.println(b + ": " + System.identityHashCode(b));
}

public static void main(String[] args) {
    String a = buildSampleString();
    performTest(a);
    performComputation(); // see below for details
    performTest(a);
}

buildSampleString() produces new, equal strings every time it is called. One of these instances, a, is kept throughout the lifetime of the program. performTest(a) builds a new one, b, interns it, then compares it to a (both for identity and equality), and as expected, they are equal but not identical. The latter is because a was not interned.

The documentation for String.intern() says:

It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.

By transitivity of equals(), the two b strings (one from each performTest call) are equal, so per the documentation they should be identical, and System.identityHashCode(b) should return the same value both times. It sometimes does, but only if performComputation() doesn't do too much work in between. If it works too hard (my suspicion is that this has to do with thrashing the heap), then System.identityHashCode(b) returns a different value the second time, which should be impossible if the documentation is correct.

This is the code for performComputation:

private static final Random random = new Random();

private static String randomString() {
    StringBuilder builder = new StringBuilder();
    for (int i = 0; i < 10000000; i++) {
        builder.append((char)(random.nextInt(127 - 32) + 32));
    }
    return builder.toString();
}

private static void performComputation() {
    for (int i = 0; i < 10; i++) {
        String s = randomString();
        System.out.println(s.substring(0, 3) + "..." + s.substring(s.length() - 3));
    }
}

If I change the loop in randomString() from 10000000 iterations to 10, then I get the same identity hash both times.

What exactly is going on here?

Edit: Full code to reproduce the behavior:

import java.util.Random;

public class Main {

    private static final Random random = new Random();

    private static String randomString() {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < 10000000; i++) {
            builder.append((char)(random.nextInt(127 - 32) + 32));
        }
        return builder.toString();
    }

    private static void performComputation() {
        for (int i = 0; i < 10; i++) {
            String s = randomString();
            System.out.println(s.substring(0, 3) + "..." + s.substring(s.length() - 3));
        }
    }

    // ----------------------------------------------------------------------------------------------------------------

    private static String buildSampleString() {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            builder.append((char)(i + 'a'));
        }
        return builder.toString();
    }

    private static void performTest(String a) {
        String b = buildSampleString().intern();
        System.out.println("a vs. b: " + (a == b) + ", " + a.equals(b));
        System.out.println(b + ": " + System.identityHashCode(b));
    }

    public static void main(String[] args) {
        String a = buildSampleString();
        performTest(a);
        performComputation();
        performTest(a);
    }
}

Upvotes: -3

Views: 114

Answers (1)

Sotirios Delimanolis

Reputation: 280132

From what I can tell from your post and comments, such as

The docs for intern() claim that the same object is returned for equal strings

your misunderstanding begins with the fact that

String a = buildSampleString();
String b = buildSampleString().intern();
System.out.println("a vs. b: " + (a == b) + ", " + a.equals(b));

prints

a vs. b: false, true

In other words, why are a and b referencing different objects?

Each invocation of buildSampleString() returns a new object. The result of the first invocation is assigned to a. Nothing in your program ever changes (reassigns) a. The second invocation also returns a new object; intern() then adds it to the pool (if no equal string is already there) and returns a reference to the pooled instance, which is stored in b.

a and b therefore reference different objects, so == evaluates to false.
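
Here is a minimal sketch of that, reusing the buildSampleString() helper from the question. The class name InternDemo and the extra variable c are made up for illustration. It only shows the case where every reference stays strongly reachable for the whole run; it does not try to reproduce the effect of performComputation().

```java
class InternDemo {

    // Same helper as in the question: builds a fresh "abcdefghij" on every call.
    static String buildSampleString() {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < 10; i++) {
            builder.append((char) (i + 'a'));
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String a = buildSampleString();          // fresh object, never interned
        String b = buildSampleString().intern(); // reference to the pooled instance
        String c = buildSampleString().intern(); // hypothetical third string, also interned

        System.out.println(a == b);          // false: a is not the pooled instance
        System.out.println(a.equals(b));     // true: same characters
        System.out.println(b == c);          // true: both refer to the pooled instance
        System.out.println(a.intern() == b); // true: interning a yields the pooled instance
    }
}
```

The guarantee in the documentation is about the values returned by intern(), not about the strings you call it on. a itself is never replaced by the pooled instance; only a.intern() is.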

Upvotes: 1
