Custer Barnes
Custer Barnes

Reputation: 51

how to force HashSet to rehash members?

In this situation where one member is edited to become equal to another, what is the proper way to force the HashSet to recalculate hashes and thereby purge itself of duplicates?

I knew better than to expect this to happen automatically, so I tried such things as intersecting the HashSet with itself, then reassigning it to a constructor call which refers to itself and the same EqualityComparer. I thought for sure the latter would work, but no.

One thing which does succeed is reconstructing the HashSet from its conversion to some other container type such as List, rather than directly from itself.

Class defs:

public class Test {
    public int N;
    public override string ToString() { return this.N.ToString(); }
    }
public class TestClassEquality: IEqualityComparer<Test> {
    public bool Equals(Test x, Test y) { return x.N == y.N; }
    public int GetHashCode(Test obj) { return obj.N.GetHashCode(); }
    }

Test code:

    TestClassEquality eq = new TestClassEquality();
    HashSet<Test> hs = new HashSet<Test>(eq);
    Test a = new Test { N = 1 }, b = new Test { N = 2 };
    hs.Add(a);
    hs.Add(b);
    b.N = 1;
    string fmt = "Count = {0}; Values = {1}";
    Console.WriteLine(fmt, hs.Count, string.Join(",", hs));
    hs.IntersectWith(hs);
    Console.WriteLine(fmt, hs.Count, string.Join(",", hs));
    hs = new HashSet<Test>(hs, eq);
    Console.WriteLine(fmt, hs.Count, string.Join(",", hs));
    hs = new HashSet<Test>(new List<Test>(hs), eq);
    Console.WriteLine(fmt, hs.Count, string.Join(",", hs));

Output:

"Count: 2; Values: 1,1"
"Count: 2; Values: 1,1"
"Count: 2; Values: 1,1"
"Count: 1; Values: 1"

Based on the final approach succeeding, I could probably create an extension method in which the HashSet dumps itself into a local List, clears itself, and then repopulates from said list.

Is that really necessary or is there some simpler way to do this?

Upvotes: 4

Views: 1046

Answers (3)

xanatos
xanatos

Reputation: 111870

There is no other way than recreating the HashSet<>. Sadly the HashSet<> constructor has an optimization so that if it is create from another HashSet<> it copies the hash codes... So we can cheat:

hs = new HashSet<Test>(hs.Skip(0), eq);

The hs.Skip(0) is a IEnumerable<>, not an HashSet<>. This defeats the HashSet<> check.

Note that there is no guarantee that in the future the Skip() won't implement a shortcircuit in case of 0, something like:

if (count == 0)
{
    return enu;
}
else
{
    return count elements;
}

(see Lippert's comment, false problem)

The "manual" method to do it is:

var hs2 = new HashSet<Test>(eq);
foreach (var value in hs)
{
    hs2.Add(value);
}
hs = hs2;

So enumerate "manually" and readd.

Upvotes: 3

Eric Lippert
Eric Lippert

Reputation: 660159

Lasse's comment is correct: you are required by the contract of HashSet to not do this, so asking what to do when you do this is a non-starter. If it hurts when you do that, stop doing that. A mutable object must not be put into a hash set if a mutation will cause its hash value to change while it is in the set. You're in a cleft stick of your own making.

To get out of that cleft stick, you could:

  • Stop mutating the objects while they are in a hash set. Remove them before you mutate them, put them back in later.
  • Fix the implementation of equality and hashing on the object so that it is consistent across mutations.
  • When you create the hash set, provide a custom hashing/equality algorithm that does not change its opinions when the object is mutated.
  • Implement your own "set" class that has whatever behaviour you like in this scenario. That is extremely difficult, so be careful. (There is a reason why this restriction was created in the first place!)

Upvotes: 11

Mureinik
Mureinik

Reputation: 311508

As you saw, HashSets don't deal with mutable objects when modifying the object affects its hash code or equality to other objects. Just remove it and re-add it:

hs.Remove(b);
b.N = 1;
hs.Add(b);

Upvotes: 2

Related Questions