Why does Hibernate require us to implement the equals/hashCode methods when I have a private id field?

First, consider this snippet:

public class Employee
{
    private Integer id;
    private String firstName;
    private String lastName;
    private String department;
    // public getters and setters here, I said PUBLIC
}

I create two objects with the same id; all the other fields are also the same.

Employee e1 = new Employee();
Employee e2 = new Employee();

e1.setId(100);
e2.setId(100);

//Prints false in console
System.out.println(e1.equals(e2));

The whole problem starts here. In a real-world application, this must return true.

Consequently, as everyone knows, a solution exists: implement equals() and hashCode().

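// requires: import java.util.Objects;
@Override
public boolean equals(Object o) {
    if(o == null)
    {
        return false;
    }
    if (o == this)
    {
        return true;
    }
    if (getClass() != o.getClass())
    {
        return false;
    }

    Employee e = (Employee) o;
    // Compare Integer values, not references: == on boxed Integers is only
    // reliable for small cached values (-128 to 127 by default)
    return Objects.equals(this.getId(), e.getId());

}
@Override
public int hashCode()
{
    final int PRIME = 31;
    int result = 1;
    // Objects.hashCode returns 0 for a null id instead of failing on unboxing
    result = PRIME * result + Objects.hashCode(getId());
    return result;
}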

Now, as usual:

        Employee e1 = new Employee();
        Employee e2 = new Employee();

        e1.setId(100);
        e2.setId(100);

        //Prints 'true' now
        System.out.println(e1.equals(e2));

        Set<Employee> employees = new HashSet<Employee>();
        employees.add(e1);
        employees.add(e2);

        //Prints only one object, of course (which was the requirement)
        System.out.println(employees);

I am going through this excellent article, Don't Let Hibernate Steal Your Identity, but there is one thing I have failed to understand completely. The problem and solution discussed above, as well as the linked article, deal with the case where two Employee objects have the same id.

Consider the case where the id field has a private setter and its value is generated by the generator class configured in hbm.xml. Once I start persisting Employee objects (and I have no way of changing the id), I find no need to implement the equals and hashCode methods. I am sure I am missing something, since my intuition says that when a concept is discussed this much across the web, it must be there to help avoid some common errors. Do I still have to implement those two methods when I have a private setter for the id field?

Upvotes: 4

Views: 5325

Answers (1)

Vlad Mihalcea

Reputation: 153950

If the entity defines a natural business key, then you should use that for equals and hashCode. The natural identifier or business key is consistent across all entity state transitions, hence the hashCode will not change when the JPA entity state changes (e.g. from New to Managed to Detached).
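
As a rough illustration of the business-key approach, here is a minimal sketch in plain Java. The class name `IsbnBook`, the `isbn` field, and the sample ISBN value are hypothetical and only stand in for whatever natural key your entity has. The JPA annotations are left out so the example runs without Hibernate on the classpath, and `setId` merely simulates the id being generated at persist time.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity; in a real mapping it would carry @Entity, @Id, etc.
class IsbnBook {
    private Long id;            // would be generated by the database on persist
    private final String isbn;  // assumed business key, set once at creation

    IsbnBook(String isbn) {
        this.isbn = Objects.requireNonNull(isbn);
    }

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }
    String getIsbn() { return isbn; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof IsbnBook)) return false;
        // Equality depends only on the business key, never on the generated id
        return isbn.equals(((IsbnBook) o).getIsbn());
    }

    @Override
    public int hashCode() {
        // Stable across New, Managed and Detached because isbn never changes
        return isbn.hashCode();
    }
}

class BusinessKeyDemo {
    public static void main(String[] args) {
        IsbnBook book = new IsbnBook("978-0-00-000000-2");
        Set<IsbnBook> books = new HashSet<>();
        books.add(book);

        book.setId(1L); // simulates the id being assigned during persist

        System.out.println(books.contains(book)); // prints true
    }
}
```

Because neither method reads the id, assigning it later has no effect on where the object sits in the set.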

In your example, you are using the assigned identifier, which doesn't change when you persist your entity.

However, if you don't have a natural identifier and you have a generated PRIMARY KEY (e.g., IDENTITY, SEQUENCE), then you can implement equals and hashCode like this:

@Entity
public class Book implements Identifiable<Long> {
 
    @Id
    @GeneratedValue
    private Long id;
 
    private String title;
 
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
 
        if (!(o instanceof Book))
            return false;
 
        Book other = (Book) o;
 
        return id != null &&
               id.equals(other.getId());
    }
 
    @Override
    public int hashCode() {
        return getClass().hashCode();
    }
 
    //Getters and setters omitted for brevity
}

The entity identifier can be used for equals and hashCode, but only if the hashCode returns the same value all the time. This might sound like a terrible thing to do since it defeats the purpose of using multiple buckets in a HashSet or HashMap.

However, for performance reasons, you should always limit the number of entities that are stored in a collection. You should never fetch thousands of entities in a @OneToMany Set because the performance penalty on the database side is multiple orders of magnitude higher than using a single hashed bucket.

The reason why this version of equals and hashCode works is that the hashCode value does not change from one entity state to another, and the identifier is checked only when it's not null.
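
To make the difference concrete, the following sketch compares a hashCode derived from the generated id with the constant one used above. Both classes are hypothetical, plain Java stand-ins (no JPA annotations), and assigning `id` directly simulates what happens when the entity is persisted after it was already added to a Set.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity whose hash is derived from the generated id
class IdHashedEntity {
    Long id;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof IdHashedEntity)) return false;
        return id != null && id.equals(((IdHashedEntity) o).id);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(id); // 0 while id is null, id.hashCode() afterwards
    }
}

// Hypothetical entity using the constant hash from the Book example
class ConstantHashedEntity {
    Long id;

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ConstantHashedEntity)) return false;
        return id != null && id.equals(((ConstantHashedEntity) o).id);
    }

    @Override
    public int hashCode() {
        return getClass().hashCode(); // same value in every entity state
    }
}

class HashStabilityDemo {
    public static void main(String[] args) {
        IdHashedEntity a = new IdHashedEntity();
        Set<IdHashedEntity> idHashed = new HashSet<>();
        idHashed.add(a);          // stored under the hash of a null id
        a.id = 42L;               // simulates persist assigning the id
        System.out.println(idHashed.contains(a));  // prints false

        ConstantHashedEntity b = new ConstantHashedEntity();
        Set<ConstantHashedEntity> constantHashed = new HashSet<>();
        constantHashed.add(b);
        b.id = 42L;               // hash is unaffected by the new id
        System.out.println(constantHashed.contains(b)); // prints true
    }
}
```

In the first case the object is still in the set, but the lookup uses the new hash and misses it. This is the situation a private id setter alone does not prevent, since the generator still assigns the id after the object may already be in a collection.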

Upvotes: 5
