Mikael Sundberg
Mikael Sundberg

Reputation: 783

Hashcode for objects with only integers

How do you in a general (and performant) way implement hashcode while minimizing collisions for objects with 2 or more integers?

update: as many stated, you cant ofcource eliminate colisions entierly (honestly didnt think about it). So my question should be how do you minimize collisions in a proper way, edited to reflect that.

Using NetBeans' autogeneration fails; for example:

public class HashCodeTest {
    @Test
    public void testHashCode() {
        int loopCount = 0;
        HashSet<Integer> hashSet = new HashSet<Integer>();
        for (int outer = 0; outer < 18; outer++) {
            for (int inner = 0; inner < 2; inner++) {
                loopCount++;
                hashSet.add(new SimpleClass(inner, outer).hashCode());
            }
        }
        org.junit.Assert.assertEquals(loopCount, hashSet.size());
    }

    private class SimpleClass {
        int int1;
        int int2;

        public SimpleClass(int int1, int int2) {
            this.int1 = int1;
            this.int2 = int2;
        }

        @Override
        public int hashCode() {
            int hash = 5;
            hash = 17 * hash + this.int1;
            hash = 17 * hash + this.int2;
            return hash;
        }
    }
}

Upvotes: 0

Views: 2948

Answers (5)

yshavit
yshavit

Reputation: 43391

As others have said, it's more important to minimize collisions that to eliminate them -- especially since you didn't say how many buckets you're aiming for. It's going to be much easier to have zero collisions with 5 items in 1000 buckets than if you have 5 items in 2 buckets! And even if there are plenty of buckets, your collisions could look very different with 1000 buckets vs 1001.

Another thing to note is that there's a good chance that the hash you provide won't even be the one the HashMap eventually uses. If you take a look at the OpenJDK HashMap code, for instance, you'll see that your keys' hashCodes are put through a private hash method (line 264 in that link) which re-hashes them. So, if you're going through the trouble of creating a carefully constructed custom hash function to reduce collisions (rather than just a simple, auto-generated one), make sure you also understand who's going to use it, and how.

Upvotes: 0

barsju
barsju

Reputation: 4446

This is what eclipse auto-generates:

@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + getOuterType().hashCode();
    result = prime * result + int1;
    result = prime * result + int2;
    return result;
}

And with this code your testcase passes...

PS: And don't forget to implement equals()!

Upvotes: 1

Jeffrey
Jeffrey

Reputation: 44808

Creating a hash method with zero collisions is impossible. The idea of a hash method is you're taking a large set of objects and mapping it to a smaller set of integers. The best you can do is minimize the number of collisions you get within a subset of your objects.

Upvotes: 0

TacticalCoder
TacticalCoder

Reputation: 6325

Can you in a general (and performant) way implement hashcode without colisions for objects with 2 or more integers.

It is technically impossible to have zero collision when hashing to 32 bits (one integer) something made of more than 32 bits (like 2 or more integers).

Upvotes: 1

Louis Wasserman
Louis Wasserman

Reputation: 198093

There is no way to eliminate hash collisions entirely. Your approach is basically the preferred one to minimize collisions.

Upvotes: 0

Related Questions