Reputation: 51817
I have the following class:
Public Class Pair(Of T1, T2)
Public Property First As T1
Public Property Second As T2
Public Sub New(Optional ByVal first As T1 = Nothing, Optional ByVal second As T2 = Nothing)
Me.First = first
Me.Second = second
End Sub
Public Overrides Function tostring() As String
Return String.Format("<[{0},{1}]>", First, Second)
End Function
Public Overrides Function GetHashCode() As Integer
Return Integer.Parse(String.Format("{0}{1}", First.GetHashCode, Second.GetHashCode))
End Function
End Class
However, when I create a dictionary using Pair as the key:
Dim Pairs as Dictionary(Of Pair(Of Integer, Integer), String)
Dim p = new Pair(of integer, integer)(1234, 13)
Dim p2 = new Pair(of integer, integer)(1234, 13)
console.writeline(String.Format("Hash 1:{0} Hash 2:{1}", p.gethashcode(), p2.gethashcode()))
Pairs.add(p, "Hello")
Console.WriteLine(Pairs(p2))
My expectation is that since both p
and p2
have the hash code of 123413
they would hit the same dictionary element and that the WriteLine
would display "Hello". What really happens, however, is that I get a KeyNotFoundException
, which leads me to believe that the Dictionary (Of...)
doesn't actually use the GetHashCode
method.
So what do I need to do to make both of these Pairs refer to the same dictionary element?
Thanks!
Upvotes: 0
Views: 1773
Reputation: 6044
A few things:
GetHashCode
is used by the Dictionary to figure out where to store the key internally, but it's not an exact match scenario. In ideal circumstances, the hash code should map each key to a unique slot index, making lookups extremely quick.
In practice, values in a Dictionary are stored at an index in an array . With 2^32 different types of hash codes, it is infeasible to create an array index for each hash code, so the Dictionary transforms the hash code into an array index where the values are stored. Because of this, the Dictionary experiences what are called "hash collisions". This means that different keys will map to the same hash value.
Dictionary's are good at managing this, but ultimately when two or more hash codes create the same index (which will happen when the collection gets big enough), the Equals method must determine which key to use to locate the Key/Value pair that contains the value you are after. If Equals is false for all items in the bucket, then it returns the KeyNotFoundException
you experienced.
On to the code:
While you could override Equals
, I don't see why you need to. For starters, I would get rid of your GetHashCode
. You will eventually have problems with it, as shown here:
Dim p = new Pair(of integer, integer)(Int32.MaxValue, Int32.MaxValue)
p.gethashcode() 'BOOM!!!
Instead, based on what you are doing here, I'd recommend that you convert your Pair class to a struct
(Structure in VB), leaving Equals and GetHashCode alone. This is really only a good idea if you are assigning value types (int, byte, bool, etc...) to the Pair because of performance reasons. I would really consider this though.
If you have to have a class, create a representative key that returns a type that is suitable for the Dictionary. For instance, because KeyValuePair
is a value type, it will be compared based on its value, not a reference.
Public Function GetKey() As KeyValuePair(Of T1, T2)
Return New KeyValuePair(Of T1, T2)(First, Second)
End Function
And your Dictionary becomes
Dim Pairs as Dictionary(Of KeyValuePair(Of Integer, Integer), String)
Pairs.add(p.GetKey(), "Hello")
Console.WriteLine(Pairs(p2.GetKey()))
(If there are any syntax errors, it is because I am not a VB programmer.)
Upvotes: 1
Reputation: 117084
You have a number of issues you have to deal with here.
To start with - you must also override Equals
as GetHashCode
is only meant to be a fast method to determine if two objects are not equal. It never is meant to tell you if the objects are equal. That's what Equals
is for - it is meant to be the final check that two objects are equal and it could be much slower to compute than GetHashCode
.
Here's an example. Say you have three very long strings, two that are the same length and the other different. If GetHashCode
returned the length of the strings then you could very quickly determine that the third string is definitely not equal to the first two. You'd have to check the actual contents of the first two to see if they are equal and this could be a lengthy process comparatively.
The next thing, and this is equally important, you cannot have a hash code that changes during the lifetime of the object.
The key word is cannot. It will break things.
The Dictionary(Of ,)
class uses a series of "buckets" to quickly find values based on the hash code of the key. So, if your key's hash code changes after it was added to the dictionary then the dictionary will not be able to find it and it will allow you to add the key twice!
Here's an example:
Dim d = New Dictionary(Of Pair(Of Integer, Integer), String)
Dim p = new Pair(Of Integer, Integer)(10, 11)
d.Add(p, "James")
Dim before = d(p) ' Found!
p.First = 12
Dim after = d(p) ' NOT Found!
Or this:
Dim d = New Dictionary(Of Pair(Of Integer, Integer), String)
Dim p = new Pair(Of Integer, Integer)(10, 11)
d.Add(p, "James")
p.First = 12
d.Add(p, "Tom") ' ALLOWED!
This is why mutable structs are bad.
I hope this helps.
Upvotes: 0
Reputation: 1500903
Having the same hash code isn't enough - the two keys need to be equal too, i.e. key1.Equals(key2)
has to be true (or the equivalent under a custom comparer).
You haven't overridden Equals
, so two Pair
objects are always unequal.
(Note that your hash code function will also fail in various ways, such as if they're both negative. Why not just combine the two integer values in some way?)
I don't know VB well enough to come up with the suitable override myself when I ought to be going to bed, but in C# it would be something like:
public override bool Equals(object other)
{
if (other == null)
{
return false;
}
if (other.GetType() != this.GetType())
{
return false;
}
var otherPair = (Pair<T1, T2>) other;
return EqualityComparer<T1>.Default(this.First, otherPair.First) &&
EqualityComparer<T2>.Default(this.Second, otherPair.Second);
}
(I'd use EqualityComparer<T>.Default
for the hash code generation too, by the way.)
Upvotes: 3