Reputation: 19270

Using a cached hashCode() value in equals()

For immutable class String; String :: hashCode computation will happen only once in life time of that object. So calling hashCode() after the first time is always just returning private int hash. No CPU will be wasted on computation.

public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {    // **h != 0 in second time**
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

As we know the contract between hashcode and equals as

equal objects must produce the same hash code

So i believe below piece of code will improve the performance on string equals(). This may be redundant in hashmap, since hashmap already got the bucket based on hashcode, but this has good performance improvement on BIG List search.

//Does the below will improve equals performance on BIG LIST?
if (this.hashCode() != anObject.hashCode()) {
        return false;
}

Please comment your thoughts the below api.

public boolean equals(Object anObject) {
if (this == anObject) {
    return true;
}

if (this.hashCode() != anObject.hashCode()) {
    return false;
}

if (anObject instanceof String) {
    String anotherString = (String)anObject;
    int n = count;
    if (n == anotherString.count) {
    char v1[] = value;
    char v2[] = anotherString.value;
    int i = offset;
    int j = anotherString.offset;
    while (n-- != 0) {
        if (v1[i++] != v2[j++])
        return false;
    }
    return true;
    }
}
return false;
}

UPDATE

As mentioned by 'dasblinkenlight' there is some cpu cycles required only at the first time of hashcode() API is called.

Since Java is maintaining String Pool; and if application requires large String multiple time comparison other than at hashing; then we can go for Utility method like below.

public boolean StringCompare (String one, String two) {

     if (one == two) {
         return true;
     }

     if (one.hashCode() != two.hashCode()) {
        return false;
     }


    int n = one.count;
    if (n == two.count) {
    char v1[] = one.value;
    char v2[] = two.value;
    int i = one.offset;
    int j = two.offset;
    while (n-- != 0) {
        if (v1[i++] != v2[j++])
        return false;
    }
    return true;

}

Upvotes: 1

Answers (3)

Dev

Reputation: 12206

Your optimization would require more work on String objects that are actually equal to each other, as you'd still have to iterate them to ensure they were equal.

The contract requires that equals objects produce the same hashcode. The reverse is not true, in that, objects which produce the same hashcode are not necessarily equal (a hash collision).

Upvotes: 3

Sergey Kalinichenko

Reputation: 727047

There's nothing wrong in putting a check like that in your own code to save on an equality check when you know that the comparison is going to fail most of the time, but putting this into the general code has a chance to degrade the overall performance of your system for two reasons:

Computing hash code for the first time takes some CPU cycles; when equals method is called outside of the context of a hash container, the CPU cycles needed to compute the hash code would be wasted
When Strings are used as keys in a hash container, the container establishes the equality of hash codes before proceeding with the equality check, so the comparison of hash codes inside the equals() method becomes redundant.

Upvotes: 8

Igor

Reputation: 34011

Java is very good at optimization, you do not need to micro-optimize for it. You should write code that is maintainable and readable, and look into performance after you've looked into sources that may be causing issues (and after many performance tests). Writing obscure or hard-to-read code will most likely result in bugs in the future when you or others are unable to discern why the method is written the way it is.

Have you found that your .equals() is the bottleneck in your performance? If not, I would say stick with the code that is more readable and less likely to introduce bugs in the future. The performance difference in your case is most likely negligible, but you can run tests to compare the two implementations.

Upvotes: 5

Using a cached hashCode() value in equals()

UPDATE

Answers (3)

Related Questions