Reputation: 4117
I started testing my hash function to see how unique the hash codes generated by my algorithm are. I wrote the following test class to find out when the same hash code is generated twice.
using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        var hashes = new List<int>();
        for (int i = 0; i < 100000; i++)
        {
            var vol = new Volume();
            var code = vol.GetHashCode();
            if (!hashes.Contains(code))
            {
                hashes.Add(code);
            }
            else
            {
                Console.WriteLine("Same hash code generated on the {0} retry", hashes.Count());
            }
        }
    }
}

public class Volume
{
    public Guid DriverId = Guid.NewGuid();
    public Guid ComputerId = Guid.NewGuid();
    public int Size;
    public ulong VersionNumber;
    public int HashCode;
    public static ulong CurDriverEpochNumber;
    public static Random RandomF = new Random();

    public Volume()
    {
        Size = RandomF.Next(1000000, 1200000);
        CurDriverEpochNumber++;
        VersionNumber = CurDriverEpochNumber;
        HashCode = GetHashCodeInternal();
    }

    public int GetHashCodeInternal()
    {
        unchecked
        {
            var one = DriverId.GetHashCode() + ComputerId.GetHashCode() * 22;
            var two = (ulong)Size + VersionNumber;
            var result = one ^ (int)two;
            return result;
        }
    }
}
The GUID fields DriverId and ComputerId and the int field Size are random. I assumed that at some point the same hash code would be generated, which, as you know, causes problems when working with big collections. The strange part is that the retry number at which the duplicate hash code appears is always the same! I ran the sample code several times and got nearly the same result: the first run hit a duplicate on retry 10170, the second on 7628, the third on 7628, and again and again on 7628. Sometimes I got slightly different results, but in most cases it was 7628.
I have no explanation for this. Is it a bug in the .NET random number generator, or something else?
Thanks all. Now it is clear that there was a bug in my code (Matthew Watson). I had to call GetHashCodeInternal() and not GetHashCode(). The version that gave me the most unique results was:
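public int GetHashCodeInternal()
{
    unchecked
    {
        var one = DriverId.GetHashCode() + ComputerId.GetHashCode();
        var two = ((ulong)Size) + VersionNumber;
        // Note: << binds tighter than ^, and for an int operand C# uses only
        // the low 5 bits of the shift count, so "<< 32" shifts by 0 here.
        var result = one ^ (int)two << 32;
        return result;
    }
}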
But at around 140,000 items it still produces a duplicate code... I think that is not good, because we have collections of around 10,000 items...
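As a rough sanity check on those numbers, here is a minimal sketch of the standard birthday-problem approximation. It assumes the hash codes behave like uniformly random 32-bit values, which is an idealization and not something measured on the Volume class; the names BirthdayEstimate and CollisionProbability are made up for this illustration.

```csharp
using System;

public static class BirthdayEstimate
{
    // Approximate probability that n uniformly random 32-bit hash codes
    // contain at least one duplicate: 1 - exp(-n(n-1) / (2 * 2^32)).
    // Assumption: codes are uniform and independent (an idealization).
    public static double CollisionProbability(long n)
    {
        double buckets = 4294967296.0; // 2^32 possible int values
        return 1.0 - Math.Exp(-(double)n * (n - 1) / (2.0 * buckets));
    }

    public static void Main()
    {
        foreach (long n in new long[] { 10000, 77000, 140000 })
        {
            Console.WriteLine("{0,7} items: {1:P1} chance of at least one duplicate",
                n, CollisionProbability(n));
        }
    }
}
```

Under that assumption the estimate is roughly 1% for 10,000 items and roughly 90% for 140,000 items, so a first duplicate somewhere near 140,000 is about what any 32-bit hash would be expected to produce.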
Upvotes: 1
Views: 202
Reputation: 109567
If you change your Console.WriteLine() to also print Volume.Size like so:
Console.WriteLine("Same hash code generated on the {0} retry ({1})", hashes.Count, vol.Size);
you will see that although hashes.Count is always the same for the first collision, vol.Size is usually different.
This seems to rule out the random number generator as the cause of this issue. It looks like some strange property of GetHashCodeInternal().
Closer inspection reveals that you are calling the wrong hash code function.
This line: var code = vol.GetHashCode();
Should be: var code = vol.HashCode;
Try that instead! At the moment you are calling the default .NET GetHashCode(), which is not doing what you want at all.
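An alternative to reading vol.HashCode is to override GetHashCode() so that the call in the test loop runs the custom algorithm. The following is only a sketch of that idea, not the original poster's final code: the CollisionTest class and the FirstCollision helper are hypothetical names, and the List<int> is swapped for a HashSet<int> purely to make the lookup cheaper.

```csharp
using System;
using System.Collections.Generic;

public class Volume
{
    public Guid DriverId = Guid.NewGuid();
    public Guid ComputerId = Guid.NewGuid();
    public int Size;
    public ulong VersionNumber;

    private static ulong curDriverEpochNumber;
    private static readonly Random RandomF = new Random();

    public Volume()
    {
        Size = RandomF.Next(1000000, 1200000);
        VersionNumber = ++curDriverEpochNumber;
    }

    // Overriding GetHashCode() means vol.GetHashCode() now runs the custom
    // algorithm instead of the default Object.GetHashCode().
    public override int GetHashCode()
    {
        unchecked
        {
            var one = DriverId.GetHashCode() + ComputerId.GetHashCode() * 22;
            var two = (ulong)Size + VersionNumber;
            return one ^ (int)two;
        }
    }
}

public static class CollisionTest
{
    // Returns the number of distinct codes seen before the first duplicate,
    // or -1 if no duplicate appears within maxItems. (Hypothetical helper.)
    public static int FirstCollision(int maxItems)
    {
        var seen = new HashSet<int>();
        for (int i = 0; i < maxItems; i++)
        {
            // HashSet<int>.Add returns false when the value is already present.
            if (!seen.Add(new Volume().GetHashCode()))
            {
                return seen.Count;
            }
        }
        return -1;
    }

    public static void Main()
    {
        Console.WriteLine("First duplicate after {0} distinct codes",
            FirstCollision(100000));
    }
}
```

With the custom function actually being called, the position of the first duplicate should vary from run to run, because it now depends on the random GUIDs and Size values.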
Upvotes: 2
Reputation: 45083
You will need to create a single random number generator and pass it in so that it is reused. Currently you are creating new instances too close together, which results in the same seed being used and hence the same sequence of numbers coming out.
Your results will only seem random at the points where the seed happens to be taken from the next tick or second of the clock. So the variation is just incidental, really.
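To illustrate the general point about seeds, here is a small sketch showing that two Random instances built from the same seed produce the same sequence. The seed value 12345 and the SeedDemo and Draw names are arbitrary choices for this example. Note that the Volume class in the question already keeps one static RandomF, which is the reuse pattern described here; also, seeding the parameterless constructor from the clock is .NET Framework behavior, and newer .NET versions may behave differently.

```csharp
using System;

public static class SeedDemo
{
    // Draws `count` values from a generator created with the given seed.
    public static int[] Draw(int seed, int count)
    {
        var rng = new Random(seed);
        var values = new int[count];
        for (int i = 0; i < count; i++)
        {
            values[i] = rng.Next();
        }
        return values;
    }

    public static void Main()
    {
        // Same seed, two separate instances: the sequences are identical.
        int[] a = Draw(12345, 3);
        int[] b = Draw(12345, 3);
        Console.WriteLine(string.Join(", ", a));
        Console.WriteLine(string.Join(", ", b));

        // One shared instance keeps advancing through a single sequence.
        var shared = new Random(12345);
        Console.WriteLine("{0}, {1}, {2}", shared.Next(), shared.Next(), shared.Next());
    }
}
```

All three printed lines should show the same numbers, since each one is just the first three values of the sequence for that seed.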
Upvotes: 1