Reputation: 4806
I'm using a hash table (the .NET Dictionary object) as part of a sparse two-dimensional data set. Most of the entries in the hash table will be close together. I'll probably end up with 100 to 10,000 entries, all of them clustered near zero. I've read that a hash table performs better when the hashes are spread across the entire 32-bit integer range.
Is there a cheap way to map consecutive integers onto wildly different values in a 1:1 fashion? I don't need to map them back; it's purely a one-way thing.
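For reference, my setup looks roughly like this (simplified; the exact coordinate-to-key mapping here is just for illustration):
var grid = new Dictionary<int, double>();
// Keys are derived from small (x, y) coordinates, so they all end up clustered near zero.
for (int x = 0; x < 100; x++)
    for (int y = 0; y < 100; y++)
        grid[x * 1000 + y] = 0.0;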
Upvotes: 0
Views: 303
Reputation: 16909
Maybe I'm misunderstanding what you are saying, but Dictionary will already hash your integers; there shouldn't be any need to pre-hash them. Why not try the default implementation first and see how it goes, instead of attempting a premature optimization that will in all likelihood be pointless?
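For example, something as simple as this should be fine (a quick sketch, with a string payload just for illustration):
var table = new Dictionary<int, string>();
for (int i = 0; i < 10000; i++)
{
    // Int32.GetHashCode() simply returns the value itself;
    // Dictionary takes care of mapping it onto its internal buckets.
    table[i] = "value " + i;
}
Console.WriteLine(table[42]);   // prints "value 42"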
Upvotes: 3
Reputation: 300559
If you know the maximum value of your key set (kmax), you could expand the keys by a constant factor: multiply by a fixed prime number that keeps the product below the maximum integer value (2^31 - 1), i.e. the prime nearest to (2^30) / kmax.
Note: make sure the prime used is not the same as the number of buckets in the hash table.
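A rough sketch of that idea (the kmax of 100,000 and the prime 10739 below are placeholder assumptions, not recommendations):
// Sketch only: assumes kmax is about 100,000, so we pick a prime near (2^30) / 100,000 ≈ 10,737.
const int SpreadPrime = 10739;   // hypothetical prime close to that value

static int Spread(int key)
{
    // Stays 1:1 as long as key * SpreadPrime does not exceed 2^31 - 1.
    return key * SpreadPrime;
}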
Here is another solution: Since the .NET Random class will generate the same value for the same seed, you could use that to distribute the incoming keys.
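Something along these lines (a sketch; it is deterministic per key, but not strictly guaranteed to be collision-free):
static int SpreadViaRandom(int key)
{
    // The same seed always yields the same first value,
    // so each key maps to the same widely spread number every time.
    return new Random(key).Next();
}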
Upvotes: 1
Reputation: 655
Instead of using Integer directly, write a small wrapper type for it (you can't actually inherit from Integer, since Int32 is a sealed value type) and override the GetHashCode function. This way you don't have to do anything but write that one function!
The easiest way I can think of to spread out the values evenly is to do something like:
public class MyInteger
{
    public int Value;

    public override bool Equals(object obj) =>
        obj is MyInteger other && other.Value == Value;

    public override int GetHashCode()
    {
        unchecked
        {
            // The value raised to its own power, truncated back to an int.
            return (int)Math.Pow(Value, Value);
        }
    }
}
Nice and evenly split up, while keeping the effort to a minimum.
Upvotes: 1