Rohit Deshpande
Rohit Deshpande

Reputation: 23

Hash code for array size 600 with least collisions

So I'm working with a file that has 400 data values, all ints and ranging from 4 to 20,000 in value. I load all these into an array of size 400. There is another empty array of ListNodes of size 600 that I will move the data to, but using a self-written hash code (I'll post it below).

Because each index in the array of length 600 has a ListNode in it, if there are any collisions, then the data value is added to the back of the ListNode. I also have a method that returns the percent of the array that is null. But basically since I'm loading 400 data values into an array of size 600, the least percent of nulls I can have is 33.3% because if there are no collisions, then 400 slots in the array are taken and 200 are null, but this is not the case:

return (num+123456789/(num*9365))%600; //num is the value read from the array of 400

That hashCode has given me my best result of 48.3% nulls and I need it to be below 47% at least. any suggestions or solutions to imporve this hashCode? I would greatly appreciate any help. If you need any more info or details please let me know. Thank you!!!

Upvotes: 0

Views: 625

Answers (2)

Henry
Henry

Reputation: 43738

I did some experiments with random numbers: generate 400 uniformly distributed random numbers in the range [0, 599] and check how many values in that range are not generated. It turns out, on average 51.3 % of the values are not generated. So your 48.3 % is already better than expected. The target of 47 % seems unrealistic unless some form of perfect hashing is used.

If you want to make some experiments on your own, here is the program.

public static void main(String[] args) {
    Random r = new Random();
    int[] counts = new int[600];
    for (int i = 0; i < 400; i++) {
        counts[r.nextInt(600)]++;
    }
    int n = 0;
    for (int i = 0; i < 600; i++) {
        if (counts[i] == 0) {
            n++;
        }
    }
    System.out.println(100.0 * n / 600);
}

Upvotes: 1

osanger
osanger

Reputation: 2352

I'd use JAVAs implementation of the hashing-algorithm:

Hava a look at open-jdk HashMap

static int hash(int h) {
     // This function ensures that hashCodes that differ only by
     // constant multiples at each bit position have a bounded
     // number of collisions (approximately 8 at default load factor).
     h ^= (h >>> 20) ^ (h >>> 12);
     return h ^ (h >>> 7) ^ (h >>> 4);
 }

Note you have too add a modulo-operation to make sure that the value wouldn't be greater than 600


EDIT 1

>>> is logical shift right

EXAMPLE:

10000000 >>> 2 = 00100000

Upvotes: 0

Related Questions