Guillem

Reputation: 57

Create a unique hashCode based on many values

I am trying to implement a unique hashCode based on six different values. My class has the following attributes:

private int id_place;
private String algorithm;
private Date mission_date;
private int mission_hour;
private int x;
private int y;

I am calculating the hashCode as follows:

id_place * (7 * algorithm.hashCode()) + (31 * mission_date.hashCode()) + (23 * mission_hour + 89089) + (x * 19 + 67067) + (y * 11 + 97097);

How can I turn it into a unique hashCode? I'm not confident it is unique...

Upvotes: 4

Views: 8128

Answers (5)

JineshEP

Reputation: 748

Because you have multiple fields, use:

@Override
public int hashCode() {
    return Objects.hash(id_place, algorithm, mission_date, mission_hour, x, y);
}

If objA.equals(objB) is true, then objA and objB must return the same hash code. If objA.equals(objB) is false, then objA and objB may still return the same hash code. If your hashing algorithm happens to return different hash codes in that case, it is good for performance.

@Override
public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    ClassA classA = (ClassA) o;
    return id_place == classA.id_place &&
            mission_hour == classA.mission_hour &&
            x == classA.x &&
            y == classA.y &&
            Objects.equals(algorithm, classA.algorithm) &&
            Objects.equals(mission_date, classA.mission_date);
}
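To see the two methods working together, here is a minimal sketch of the whole class. The name ClassA comes from the equals method above; the constructor, the sample values, and the duplicateIsRejected helper are assumptions added only for illustration.

```java
import java.util.Date;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// ClassA is the name used in the answer; the constructor and demo are assumptions.
final class ClassA {
    private final int id_place;
    private final String algorithm;
    private final Date mission_date;
    private final int mission_hour;
    private final int x;
    private final int y;

    ClassA(int id_place, String algorithm, Date mission_date, int mission_hour, int x, int y) {
        this.id_place = id_place;
        this.algorithm = algorithm;
        // Copy the Date: it is mutable, and changing a hashed field after
        // insertion into a HashSet would make the element unfindable.
        this.mission_date = (mission_date == null) ? null : new Date(mission_date.getTime());
        this.mission_hour = mission_hour;
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        ClassA other = (ClassA) o;
        return id_place == other.id_place && mission_hour == other.mission_hour
                && x == other.x && y == other.y
                && Objects.equals(algorithm, other.algorithm)
                && Objects.equals(mission_date, other.mission_date);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id_place, algorithm, mission_date, mission_hour, x, y);
    }

    // Returns true when a HashSet refuses a second, equal instance.
    static boolean duplicateIsRejected() {
        Set<ClassA> set = new HashSet<>();
        set.add(new ClassA(1, "astar", new Date(0L), 12, 3, 4));
        return !set.add(new ClassA(1, "astar", new Date(0L), 12, 3, 4));
    }

    public static void main(String[] args) {
        System.out.println(duplicateIsRejected()); // prints true
    }
}
```

The second add() is rejected because both instances are equal and therefore hash to the same bucket. Nothing here makes the hash code unique; it only keeps it consistent with equals.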

Upvotes: 0

Joop Eggen

Reputation: 109547

Uniqueness is not a hard requirement, but the fewer collisions the hash code produces, the better.

Note first that a hash code is generally used by a HashMap as an index into a table of buckets. Ideally it should therefore be unique modulo the number of buckets. That number can change, however, as the map grows.

But okay, towards an optimal hash code:

  • Ranges are important: if x and y were in 0..255, they could be packed uniquely into two bytes; if they were in 0..999, then as y*1000+x. For a LocalDateTime, one could take the value as a long in seconds (instead of ms or ns) counted from 2012-01-01, so that you might assume a range from 0 up to, say, two years in the future.
  • You can explore existing test data or generate plausible test data. One can then mathematically optimize the hash code function by tuning its more or less arbitrary coefficients (7, 13, 23). This is linear optimisation, but it can also be done by simple trial and error: counting the clashes for varying (A, B, C).

    //int[] coefficients = ...;
    int[][] coefficientsCandidates = new int[NUM_OF_CANDIDATES][NUM_OF_COEFFS];
    ...
    int[] collisionCounts = new int[NUM_OF_CANDIDATES];
    for (Data data : allTestData) {
        ... update collisionCounts for every candidate
    }
    ... take the candidate with smallest collision count
    ... or sort by collisionCounts and pick other candidates to try out
    

In general such evaluation code is not needed for a working hash code, but it can detect bad hash codes where the pseudo-randomness goes wrong, for instance when a factor is far too large for the range (weekday * 1000), so that holes appear in the values.
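The trial-and-error idea can be sketched in a runnable form. This is only an illustration under stated assumptions: the hash is reduced to two fields (x and y), the test data is a 100 x 100 grid, and the class name, method names, and candidate coefficients are all made up for the example.

```java
import java.util.HashSet;
import java.util.Set;

class CollisionCount {
    // Hash of an (x, y) pair for candidate coefficients a and b (illustrative form).
    static int hash(int a, int b, int x, int y) {
        return a * x + b * y;
    }

    // Counts the inputs whose hash was already produced by an earlier input.
    static int collisions(int a, int b, int[][] points) {
        Set<Integer> seen = new HashSet<>();
        int clashes = 0;
        for (int[] p : points) {
            if (!seen.add(hash(a, b, p[0], p[1]))) {
                clashes++;
            }
        }
        return clashes;
    }

    // Assumed test data: every point with x and y in 0..size-1.
    static int[][] grid(int size) {
        int[][] points = new int[size * size][];
        for (int x = 0; x < size; x++) {
            for (int y = 0; y < size; y++) {
                points[x * size + y] = new int[] {x, y};
            }
        }
        return points;
    }

    public static void main(String[] args) {
        int[][] data = grid(100);
        // Candidate coefficient pairs; {1, 100} packs the range exactly (y*100 + x).
        int[][] candidates = { {1, 1}, {19, 11}, {1, 100} };
        for (int[] c : candidates) {
            System.out.println(c[0] + ", " + c[1] + " -> " + collisions(c[0], c[1], data));
        }
    }
}
```

On this data the pair {1, 100} produces no collisions because it matches the value range exactly, which is the point of the first bullet, while {1, 1} collapses the 10000 points onto only 199 distinct sums.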

In all honesty, though, all this effort is probably not needed.

Upvotes: 2

prachi

Reputation: 305

The hash codes of two different objects need not be unique. According to https://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#hashCode() -

  1. Whenever it is invoked on the same object more than once during an execution of a Java application, hashCode() must consistently return the same value, provided no information used in equals comparisons on the object is modified. This value need not remain consistent from one execution of an application to another execution of the same application.
  2. If two objects are equal according to the equals(Object) method, then calling the hashCode() method on each of the two objects must produce the same value.
  3. It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hash tables.

So you don't have to write a hashCode() method that returns a distinct hash code for every object.
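As an extreme illustration of the third rule, the sketch below uses a key class whose hashCode() always returns the same constant. The class name ConstantHashKey and the constant 42 are made up for the example. The map still behaves correctly because equals decides the final match; it only gets slower, since every key lands in the same bucket.

```java
import java.util.HashMap;
import java.util.Map;

class ConstantHashKey {
    private final String name;

    ConstantHashKey(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof ConstantHashKey && name.equals(((ConstantHashKey) o).name);
    }

    // Legal under the contract, but poor: every instance collides.
    @Override
    public int hashCode() {
        return 42;
    }

    // Stores two unequal keys and looks one of them up again.
    static int lookupDemo() {
        Map<ConstantHashKey, Integer> map = new HashMap<>();
        map.put(new ConstantHashKey("a"), 1);
        map.put(new ConstantHashKey("b"), 2);
        return map.get(new ConstantHashKey("b"));
    }

    public static void main(String[] args) {
        System.out.println(lookupDemo()); // prints 2
    }
}
```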

Upvotes: 2

deHaar

Reputation: 18568

In Eclipse, there is a function that generates the method public int hashCode() for you. I used the class attributes you provided and the result is as follows:

@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + ((algorithm == null) ? 0 : algorithm.hashCode());
    result = prime * result + id_place;
    result = prime * result + ((mission_date == null) ? 0 : mission_date.hashCode());
    result = prime * result + mission_hour;
    result = prime * result + x;
    result = prime * result + y;
    return result;
}

It looks a lot like your calculation. However, as Andy Turner pointed out in a comment to your question and Eran in an answer, you simply cannot make a unique hash code for every single instance of an object if the number of instances exceeds the number of possible different hash codes.
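For what it is worth, the generated method computes the same value as java.util.Objects.hash called with the fields in the same order, because Objects.hash delegates to Arrays.hashCode, which uses the same accumulation with 31. A small sketch to check this; the class name HashEquivalence, the method names, and the sample values are assumptions for illustration.

```java
import java.util.Date;
import java.util.Objects;

class HashEquivalence {
    // The Eclipse-generated computation, with the fields passed as parameters.
    static int eclipseStyle(String algorithm, int id_place, Date mission_date,
                            int mission_hour, int x, int y) {
        final int prime = 31;
        int result = 1;
        result = prime * result + ((algorithm == null) ? 0 : algorithm.hashCode());
        result = prime * result + id_place;
        result = prime * result + ((mission_date == null) ? 0 : mission_date.hashCode());
        result = prime * result + mission_hour;
        result = prime * result + x;
        result = prime * result + y;
        return result;
    }

    // The same fields in the same order, handed to Objects.hash.
    static int objectsHash(String algorithm, int id_place, Date mission_date,
                           int mission_hour, int x, int y) {
        return Objects.hash(algorithm, id_place, mission_date, mission_hour, x, y);
    }

    public static void main(String[] args) {
        Date date = new Date(0L);
        System.out.println(eclipseStyle("astar", 1, date, 12, 3, 4)
                == objectsHash("astar", 1, date, 12, 3, 4)); // prints true
    }
}
```

The field order matters: passing the same fields in a different order generally gives a different, though equally valid, hash code.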

Upvotes: 1

Eran

Reputation: 393811

It doesn't have to be unique and it cannot be unique. hashCode() returns an int (32 bits), so it could only be unique if your class had a single int property and nothing else.

The Integer class can (and does) have a unique hashCode(), but few other classes do.

Since you have multiple properties, some of which are int, a hashCode() that is a function of these properties can't be unique.

You should strive for a hashCode() function that gives a wide range of different values for different combinations of your properties, but it cannot be unique.
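A small sketch makes both points concrete: an Integer's hash code is simply its value, while even two short, unequal strings can share a hash code. The class and method names are made up for the example.

```java
class HashCollisionDemo {
    // An Integer's hash code is its own int value, so it is unique per value.
    static boolean integerHashIsValue(int value) {
        return Integer.valueOf(value).hashCode() == value;
    }

    // "Aa" and "BB" are different strings with the same hash code.
    static boolean stringsCollide() {
        return !"Aa".equals("BB") && "Aa".hashCode() == "BB".hashCode();
    }

    public static void main(String[] args) {
        System.out.println(integerHashIsValue(12345)); // prints true
        System.out.println("Aa".hashCode());           // prints 2112
        System.out.println("BB".hashCode());           // prints 2112
    }
}
```

HashMap and HashSet handle such collisions by falling back to equals, so a collision costs some speed but not correctness.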

Upvotes: 4
