oliver
oliver

Reputation: 2901

Set of sets not checking for equality

I am trying to construct a set of sets in the mathematical sense. However If I add another set which is equal to one that is already in the set of sets, but which is represented by a different object, the set gets duplicated.

HashSet<HashSet<string>> setOfSets = new HashSet<HashSet<string>>();
HashSet<string> set1 = new HashSet<string>();    
HashSet<string> set2 = new HashSet<string>();

set1.Add("Foo");
set1.Add("Bar");
set1.Add("Bar"); // set behavior okay, "Bar" is not duplicated

set2.Add("Foo");
set2.Add("Bar"); // now set1 == set2

setOfSets.Add(set1);
setOfSets.Add(set2); // now set1 AND set2 are in setOfSets, which is "wrong"

Why does the set-logic work for equality of strings but not for equality of HashSets themselves? How can I fix this with the least effort.

Upvotes: 2

Views: 226

Answers (1)

oliver
oliver

Reputation: 2901

I have been pointed to this question. Although the accepted answer is helpful for checking equality of sets by hand (by means of HashSet<T>.SetEquals(HashSet<T>)), it doesn't quite help with applying this equality logic to a set of sets.

However the non-accepted answer (by Gregory Adam) gives the crucial hint as to how this can be accomplished, namely with HashSet<string>.CreateSetComparer(). Because HashSet has a constructor that accepts an equality comparer, this is the way to go:

HashSet<HashSet<string>> setOfSets = 
    new HashSet<HashSet<string>>(HashSet<string>.CreateSetComparer());

This tells the outer HashSet how to "properly" (in the mathematical sense) compare objects the type of the inner HashSet (in my case HashSet<string>).

As Hans Passant and Eric Lippert have kindly pointed out, equality comparison of HashSets is comparatively expensive, even more so if it is applied to a nested HashSet, which might be one of the reasons why this hasn't been chosen as the default behavior.

The main reason however, according to Eric Lippert, for choosing reference equality is the mutability of the HashSet objects. Applied to my example: if I add two set-wise different HashSets (that is !set1.SetEquals(set2)) to my setOfSets and afterwards change set1's contents to become equal to set2, there are still two sets in setOfSets, although there should be only one set then. So this has led to a state which is inconsistent with the original requirement ("there shall not be two equal objects in a set").

The same cannot be done with strings because strings are immutable. If I add two different string objects "Foo" and "Bar" to a HashSet, there is no legal way to change them to become equal.

Upvotes: 2

Related Questions