Kawu

Reputation: 14003

Implementation of equals() and hashCode() when no natural key is available?

This question is basically a follow-up to questions:

Should I write equals() methods in JPA entities? and What is the best practice when implementing equals() for entities with generated ids

Some background first...

You can regularly encounter the following primary key constellations:

  1. Natural keys (business keys): usually a set of real, multi-column attributes of the entity
  2. Artificial keys (surrogate keys): meaningless, usually auto-increment (IDENTITY, AUTO_INCREMENT, AUTOINCREMENT, SEQUENCE, SERIAL, ...) IDs
  3. Hybrid keys (semi-natural/semi-artificial keys): usually consisting of an artificial ID and one or more additional natural columns, e.g. any table that references another table which uses an ID and extends that key (entity_id, ordinal_nbr) or similar.

Frequent scenario: many-to-one references to a root, branch, or leaf inheritance table, which all share a common, "stupid" ID via identifying relationship/dependent key. Root (and branch) tables often make sense when another table needs to reference all entity types, e.g. PostAddresses -> Contacts, where Contacts has sub tables Persons, Clubs, and Facilities, which have nothing in common but being "contactable".

Now to JPA:

In Java, we can create new entity objects whose PK is incomplete (null or partly null), i.e. an entity (row) that the DBMS would ultimately refuse to insert into the DB.

However, when working with application code, it's often handy to have new (or detached) entities that can be compared to existing (managed) entities, even if the new entity objects don't have a PK value yet. To achieve this for any entities that have natural key columns, use them for implementing equals() and hashCode() (as suggested by the other two SO postings).
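As a minimal sketch of the natural-key approach described above: the class below is hypothetical (the entity name `Club` and its assumed natural key of `name` plus `city` are illustration only), and the JPA annotations are left out so it runs standalone. The surrogate `id` is deliberately ignored in both methods.

```java
import java.util.Objects;

// Hypothetical entity; JPA annotations omitted so the sketch runs standalone.
class Club {
    private Long id;            // surrogate PK, null until persisted
    private final String name;  // assumed natural key, part 1
    private final String city;  // assumed natural key, part 2

    Club(String name, String city) {
        this.name = name;
        this.city = city;
    }

    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Club)) return false;
        Club other = (Club) o;
        // Only the natural key columns take part; the surrogate id is ignored.
        return Objects.equals(name, other.name) && Objects.equals(city, other.city);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, city);
    }
}
```

With this, a new object without an id compares equal to an already persisted one that has the same name and city.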

Question:

But what do you do when no natural/business key can be determined, as in the case of the Contacts table, which is basically just an ID (plus a discriminator)? What would be a good column selection policy for basing equals() and hashCode() implementations on? (artificial keys 2. and 3. above)

There's obviously not much of a choice...

One (naive) goal would be to achieve the same "transient comparability". Can it be done? If not, what does the general approach look like for artificial ID equals() and hashCode() implementations?


Note: I'm already using Apache EqualsBuilder and HashCodeBuilder... I have intentionally "naivified" my question.

Upvotes: 3

Views: 1616

Answers (3)

Omnaest

Reputation: 3096

I think the subject is simpler than the discussions suggest.

Take the database id(s) if present, otherwise use Object#equals / object identity

Why? When you put a new entity into the database, JPA does nothing more than map a newly generated database id onto the entity object's identity. Conversely, this means that the object identity already acts as a primary key beforehand.

The discussion often seems to rest on the assumption that two business objects with the same properties are equal. But they are not. E.g. two addresses with the same street and city are only equal if you don't want duplicate address values. But then you would also make those columns a primary key in the database, which means your business objects always have their primary key available. If you allow duplicate addresses for your business objects, the object's identity is the primary key, since it is the only thing that distinguishes two addresses.

After persisting an entity, the database id takes over the job completely, since you can now have clones of the same entity that share only the database id (but occupy several memory locations / have several object identities).
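A minimal sketch of this rule, assuming a hypothetical `Contact` class with a single generated `id` (JPA annotations omitted so it runs standalone): objects with an id compare by id, objects without one fall back to object identity.

```java
// Hypothetical entity; JPA annotations omitted so the sketch runs standalone.
class Contact {
    private Long id; // null until the entity has been persisted

    Long getId() { return id; }
    void setId(Long id) { this.id = id; }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;            // object identity always wins
        if (!(o instanceof Contact)) return false;
        Contact other = (Contact) o;
        // Without an id, only identity counts; with an id, compare ids.
        return id != null && id.equals(other.id);
    }

    @Override
    public int hashCode() {
        // Falls back to the identity hash while no id is assigned.
        return id != null ? id.hashCode() : System.identityHashCode(this);
    }
}
```

One caveat with this sketch: the hash code changes at the moment the id is assigned, so an object that was added to a HashSet or used as a HashMap key before being persisted may no longer be found there afterwards.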

Upvotes: 3

Nobody

Reputation: 690

One commonly suggested technique is to use UUIDs as identifiers, which has a couple of downsides.

They make for ugly URLs, and supposedly there are performance implications when querying entities by such a long identifier. The long UUIDs also make your database indexes larger.

The advantage of UUIDs is that you don't have to implement separate hashCode() and equals() methods for every entity.

The solution I've decided to use in my own projects is to combine a traditional assigned identifier with a UUID that is used internally for the hashCode() and equals() methods. It looks something like this:

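import java.io.Serializable;

import javax.persistence.Column;
import javax.persistence.EntityListeners;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.MappedSuperclass;

import org.springframework.beans.factory.annotation.Configurable;
import org.springframework.core.style.ToStringCreator;

// ModelListener is the project's own entity listener (not shown here).
@Configurable
@MappedSuperclass
@EntityListeners({ModelListener.class})
@SuppressWarnings("serial")
public abstract class ModelBase implements Serializable {

    //~ Instance Fields =======================================

    // Database-generated surrogate key; null until the entity is persisted.
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    @Column(name = "id", nullable = false, updatable = false, unique = true)
    protected Long id;

    // Assigned at construction time, so it is available before persisting.
    @Column(name = "__UUID__", unique = true, nullable = false, updatable = false, length = 36)
    private String uuid = java.util.UUID.randomUUID().toString();

    //~ Business Methods ======================================

    // The original also appended a version, but no getVersion() is defined here.
    @Override
    public String toString() {
        return new ToStringCreator(this)
            .append("id", getId())
            .append("uuid", uuid())
            .toString();
    }

    // Equality and hashing rely on the UUID only, never on the database id.
    @Override
    public int hashCode() {
        return uuid().hashCode();
    }

    @Override
    public boolean equals(Object o) {
        return (o == this || (o instanceof ModelBase && uuid().equals(((ModelBase) o).uuid())));
    }

    /**
     * Returns this object's UUID.
     *
     * @return this object's UUID
     */
    public String uuid() {
        return uuid;
    }

    //~ Accessor Methods ======================================

    public Long getId() {
        return id;
    }

    @SuppressWarnings("unused")
    private void setId(Long id) {
        this.id = id;
    }

    @SuppressWarnings("unused")
    private String getUuid() {
        return uuid;
    }

    @SuppressWarnings("unused")
    private void setUuid(String uuid) {
        this.uuid = uuid;
    }
}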

Just extend ModelBase for all of your entities. The advantage of this technique is that the uuid is assigned as soon as the object is created. But we still have an assigned id we can use in our application code to query specific objects. Basically, the uuid field is never used or even thought about in our application code except for comparison purposes. Works like a charm.
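To illustrate why assigning the UUID at construction time matters, here is a small standalone sketch. `UuidModel` is a hypothetical, annotation-free stand-in for the base class (not the class itself), and `survivesIdAssignment` is a made-up helper that simulates persisting by setting the id by hand.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical stand-in for a UUID-based entity base class, without JPA.
class UuidModel {
    private Long id; // would be filled in by the database on insert
    private final String uuid = UUID.randomUUID().toString();

    void setId(Long id) { this.id = id; }
    Long getId() { return id; }
    String uuid() { return uuid; }

    @Override
    public int hashCode() {
        return uuid.hashCode();
    }

    @Override
    public boolean equals(Object o) {
        return o == this || (o instanceof UuidModel && uuid.equals(((UuidModel) o).uuid));
    }

    // Adds a new object to a set, simulates persisting, then looks it up again.
    static boolean survivesIdAssignment() {
        Set<UuidModel> set = new HashSet<>();
        UuidModel model = new UuidModel();
        set.add(model);          // added while the id is still null
        model.setId(1L);         // simulated insert: the id is assigned later
        return set.contains(model); // true: the hash never depended on the id
    }
}
```

Because neither equals() nor hashCode() looks at the id, the object stays findable in hash-based collections across the transition from new to persisted.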

Upvotes: 1

Alex Gitelman

Reputation: 24732

If you can't find a set of properties on the object that distinguishes it from other objects of the same kind, then you can't compare those objects, can you? If you provide a detailed use case there may be more to it, but in the case of a contact with only an id and a discriminator, in the absence of the id you can only compare groups of objects that have the same discriminator. And if each group is guaranteed to have only one element, then the discriminator is your key.

Upvotes: 1
