Aelian
Aelian

Reputation: 695

Do java objects used as hash keys need to be 'fully' immutable?

If hashCode() calculation uses immutable fields and equals() uses all the fields would it be a problem when the class is used as a hash key? E.g.

import java.util.Objects;

public class Car {
    protected final long vin;
    protected String state;
    protected String plateNumber;

    public Car( long v, String s, String p ) {
        vin = v; state = s; plateNumber = p;
    }


    public void move( String s, String p ) {
        state = s; plateNumber = p;
    }


    public int hashCode() {
        return (int)( vin % Integer.MAX_VALUE );
    }


    public boolean equals( Object other ) {
        if (this == other) return true;
        else if (!(other instanceof Car)) return false;
        Car otherCar = (Car) other;
        return vin == otherCar.vin
                && Objects.equals( state, otherCar.state )
                && Objects.equals( plateNumber, otherCar.plateNumber );
    }
}

And move() is called on a car object after it is inserted into a hashset, possible via a reference kept elsewhere.

I am not after performance issues here. Only correctness.

I have read java hashCode contact, few answers on SO including this by venerable Jon Skeet and this from big blue. I feel that the last link gives the best explanation and imply that above code is correct.

Edit

Conclusion:
This class satisfy constraints placed on ‘equals()’ and ‘hashCode()’ in java. However it violates restrictions additional requirements placed on ‘equals()’ when used as keys in collections, hashed or not.
The additional requirement is that ‘equals()’ need to be consistent as long as the object is a key.
See the counter example by Louis Wasserman and the reference provided by Douglas below.

Few clarifications:

A) This class satisfy java object level constraints:

  1. ( carA == carB ) implies ( carA.hashCode() == carB.hashCode() )
  2. ( carA.hashCode() != carB.hashCode() ) implies ( carA != carB )
  3. equals() need to be reflexive, symmetric, transitive.
  4. hashCode() need to be consistent. i.e. Cannot change for an object during its lifetime.
  5. equals() need to be consistent as long as neither object is modified.

Note that the reverse of ‘1.’ and ‘2.’ are not necessary. And the class above satisfies all the conditions.
Also java docs mention "equals() … implements the most discriminating possible equivalence relation on objects", but not sure if that is compulsory.

B) As for performance, the increment in collision avoidance probability decrease with each successive member variable we combine. Usually few well chosen member variables is sufficient.

Upvotes: 2

Views: 818

Answers (7)

mario
mario

Reputation: 1162

Hash works by putting items into "buckets". Each bucket is calculated by the hashcode. After finding the bucket then the search continues comparing each item one by one using equals.

For example:

  • During insertion: an object whose id is 100 is placed in bucket 5 (the hashcode calculated 5).
  • During retrieval: you ask the hashmap to find the item 100. If the hash calculates 7 now then the algorithm will search for your object in bucket 7 but your object will never be found as it is dwelling in bucket 5.

In summary: the hash code and the actual key work together. The former is used to know in which bucket the item should be. The latter is used by the equals comparison seeking the actual item to return.

Upvotes: 1

user3458
user3458

Reputation:

Short answer: it should be OK, but prepare for bizarre behavior.

Longer answer: when you change fields that participate in equals() on a key, the value keyed by that key will no longer be found.

Still longer answer: this looks as X/Y problem: you're asking about X, but you really need X to accomplish Y. Maybe you should ask about Y?

The car in your case is uniquely identified by vin. A car equals to itself. But, a car can be registered in different states. Maybe the answer is to have a Registration object (or a few of them) attached to the car? And then you can separate car.equals() from registration.equals().

Upvotes: 1

Douglas
Douglas

Reputation: 5087

When considering only the hashCode and equals contracts, you are correct that this implementation satisfies their requirements. hashCode using a strict subset of the fields that equals uses is sufficient to guarantee that a.equals(b) implies a.hashCode() == b.hashCode() as required.

However, things change when you bring in Map. From the Map javadoc, "The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map."

After you call move on a Car that is a key in a Map, the behavior of that Map is now unspecified. In many cases it will in practice still work the way you want it to, but bizarre things could happen in ways that are hard to predict. While it would technically be valid for the Map to spontaneously empty itself or switch all lookups to use a random number generator, a more likely scenario might go like this:

Car car1 = ... Car car2 = ... // a copy of car1 Map<Car, String> map1 = ... map1.put(car1, "value"); assert map1.get(car2).equals("value"); // true car1.move(...); assert map1.get(car2).equals("value"); // NullPointerException on the equals call, car2 is no longer found

Notice that neither car2 nor the Map were changed themselves in any way, but the mapping of car2 changed (or rather, disappeared) anyway. This behavior is not officially specified, but I would guess most Map implementations do behave this way.

Upvotes: 2

Bartosz Bilicki
Bartosz Bilicki

Reputation: 13235

You may mutate your key candidates as much as you want, before or after (not during) they are used as keys. In practice, it is very hard to enforce this rule. If you mutate objects you do not have a control if somebody uses them as keys or not.

Immutability for keys is just easier, removes source of subtle, hard-to-find bugs and just work better for key.

In your case I see no correctness issues. But why you ever bother not to include all fields in hashcode?

Upvotes: 1

oopexpert
oopexpert

Reputation: 755

The short answer is: No.

Long answer:

Fully immutability is not neccessary. BUT:

Equals must only depend on immutable values. Hashcode must depend on immutable values either a constant or a subset of the values used in equals or all values used in equals. Values that are not mentioned within equals mustn't be part of hashcode.

If you mutate values equals and hashcode rely on it is likely that you do not find your objects again in a hash based datastructure. Look at this:

public class Test {


    private static class TestObject {
        private String s;

        public TestObject(String s) {
            super();
            this.s = s;
        }

        public void setS(String s) {
            this.s = s;
        }

        @Override
        public boolean equals(Object obj) {
            boolean equals = false;
            if (obj instanceof TestObject) {
                TestObject that = (TestObject) obj;
                equals = this.s.equals(that.s);
            }
            return equals;
        }

        @Override
        public int hashCode() {
            return this.s.hashCode();
        }
    }

    public static void main(String[] args) {

        TestObject a1 = new TestObject("A");
        TestObject a2 = new TestObject("A");

        System.out.println(a1.equals(a2)); // true

        HashMap<TestObject, Object> hashSet = new HashMap<>(); // hash based datastructure

        hashSet.put(a1, new Object());

        System.out.println(hashSet.containsKey(a1)); // true

        a1.setS("A*");

        System.out.println(hashSet.containsKey(a1)); // false !!! // Object is not found as the hashcode used initially before mutation was used to determine the hash bucket

        a2.setS("A*");

        System.out.println(hashSet.containsKey(a2)); // false !!! Because a1 is in wrong hash bucket ...

        System.out.println(a1.equals(a2)); // ... even if the objects are equals

    }

}

Upvotes: 0

algrid
algrid

Reputation: 5954

When your hashCode() implementation uses only limited number of fields (vs equals) you're reducing performance of almost any algorithm/data structure that uses hashing: HashMap, HashSet etc. You're increasing collision probability - it's the situation when two different objects (equals return false) have the same hash value.

Upvotes: 0

Louis Wasserman
Louis Wasserman

Reputation: 198143

It's correct if you never, ever call move after the Car is in the map. Otherwise it's wrong. Both hashCode and equals have to stay consistent after a key is in the map.

Upvotes: 2

Related Questions