Reputation: 51
In this situation where one member is edited to become equal to another, what is the proper way to force the HashSet to recalculate hashes and thereby purge itself of duplicates?
I knew better than to expect this to happen automatically, so I tried such things as intersecting the HashSet with itself, then reassigning it to a constructor call which refers to itself and the same EqualityComparer. I thought for sure the latter would work, but no.
One thing which does succeed is reconstructing the HashSet from its conversion to some other container type such as List, rather than directly from itself.
Class defs:
public class Test {
public int N;
public override string ToString() { return this.N.ToString(); }
}
public class TestClassEquality: IEqualityComparer<Test> {
public bool Equals(Test x, Test y) { return x.N == y.N; }
public int GetHashCode(Test obj) { return obj.N.GetHashCode(); }
}
Test code:
TestClassEquality eq = new TestClassEquality();
HashSet<Test> hs = new HashSet<Test>(eq);
Test a = new Test { N = 1 }, b = new Test { N = 2 };
hs.Add(a);
hs.Add(b);
b.N = 1;
string fmt = "Count = {0}; Values = {1}";
Console.WriteLine(fmt, hs.Count, string.Join(",", hs));
hs.IntersectWith(hs);
Console.WriteLine(fmt, hs.Count, string.Join(",", hs));
hs = new HashSet<Test>(hs, eq);
Console.WriteLine(fmt, hs.Count, string.Join(",", hs));
hs = new HashSet<Test>(new List<Test>(hs), eq);
Console.WriteLine(fmt, hs.Count, string.Join(",", hs));
Output:
"Count: 2; Values: 1,1"
"Count: 2; Values: 1,1"
"Count: 2; Values: 1,1"
"Count: 1; Values: 1"
Based on the final approach succeeding, I could probably create an extension method in which the HashSet dumps itself into a local List, clears itself, and then repopulates from said list.
Is that really necessary or is there some simpler way to do this?
Upvotes: 4
Views: 1046
Reputation: 111870
There is no other way than recreating the HashSet<>
. Sadly the HashSet<>
constructor has an optimization so that if it is create from another HashSet<>
it copies the hash codes... So we can cheat:
hs = new HashSet<Test>(hs.Skip(0), eq);
The hs.Skip(0)
is a IEnumerable<>
, not an HashSet<>
. This defeats the HashSet<>
check.
Note that there is no guarantee that in the future the Skip()
won't implement a shortcircuit in case of 0, something like:
if (count == 0)
{
return enu;
}
else
{
return count elements;
}
(see Lippert's comment, false problem)
The "manual" method to do it is:
var hs2 = new HashSet<Test>(eq);
foreach (var value in hs)
{
hs2.Add(value);
}
hs = hs2;
So enumerate "manually" and readd.
Upvotes: 3
Reputation: 660159
Lasse's comment is correct: you are required by the contract of HashSet to not do this, so asking what to do when you do this is a non-starter. If it hurts when you do that, stop doing that. A mutable object must not be put into a hash set if a mutation will cause its hash value to change while it is in the set. You're in a cleft stick of your own making.
To get out of that cleft stick, you could:
Upvotes: 11
Reputation: 311508
As you saw, HashSet
s don't deal with mutable objects when modifying the object affects its hash code or equality to other objects. Just remove it and re-add it:
hs.Remove(b);
b.N = 1;
hs.Add(b);
Upvotes: 2