Reputation: 5
I'm trying to implement cuckoo hashing with hash functions: hash1: key.hashcode() % capacity hash2: key.hashcode() / capacity % capacity
With an infinite loop check and rehashing method doubling capacity. The program works fine with small amount of data, but when data gets big (around 20k elements) the program keeps getting rehashing until the capacity gets overflowed.
I figured that mostly the infinite rehashing causes by the data with exactly same hashcode. After rehashing, there will be chance other data get same hashcode and causing rehashing again.
I already use Java built-in hashcode but the chance of same hashcodes still high when data is large. Even I modified a little bit hashcode method, eventually there is still data with same hashcode.
So which hash method should I use to prevent this?
Upvotes: 0
Views: 723
Reputation: 5277
A usual method to create a hash function is generally to use primes. I write a function (below), with which, I don't guarantee no collisions, but it should be lessened.
hashFunction1(String s){
int k = 7; //take a prime number, can be anything (I just chose 7)
for(int i = 0; i < s.length(); i++){
k *= (23 * (int)(s.charAt(i)));
k %= capacity;
}
}
//23 is another randomly chosen number.
You can write a similar hash function as hashFunction2, choosing two different prime numbers. But here, the main problem is, for strings "stop" and "pots", this gives same hash code.
So, an improvization over this function can be:
hashFunction1(String s){
int k = 7; //take a prime number, can be anything (I just chose 7)
for(int i = 0; i < s.length(); i++){
k *= (23 * (int)(s.charAt(i)));
k += (int)(s.charAt(i));
k %= capacity;
}
}
which will resolve this (for most cases, if not all).
If you still find this function bad, instead of s.charAt(i), you can use a unique prime number mapped to every character, ie. a=3, b=5, c=7, d=11 and so on. This should resolve collision even more.
EDIT:
+n
, which is a constant.2
is not the prime to be used in such cases. Use an odd prime number, 3 works.Upvotes: 1