Jimmy

Reputation: 3274

How to check whether the contents of more than two collections are the same

I have a List. For valid reasons, I duplicate the List many times and use the copies for different purposes. At some point I need to check whether the contents of all these collections are the same.

Well, I know how to do this. But being a fan of "shorthand" coding (LINQ...), I would like to know if I can check this EFFICIENTLY in the fewest lines of code.

    List<string> original, duplicate1, duplicate2, duplicate3, duplicate4
                                       = new List<string>();

        //...some code.....
        bool isequal = duplicate4.SequenceEqual(duplicate3)
         && duplicate3.SequenceEqual(duplicate2)
         && duplicate2.SequenceEqual(duplicate1)
         && duplicate1.SequenceEqual(original); // can we do it better than this?

UPDATE

Codeinchaos pointed out certain scenarios I hadn't thought of (duplicates and the order of the lists). Although SequenceEqual takes care of duplicates, the order of the lists can be a problem, so I am changing the code as follows. I need to copy the lists for this.

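// Work on copies, because Remove mutates the lists it is called on.
List<List<string>> copy = new List<List<string>> {
    new List<string>(duplicate1), new List<string>(duplicate2),
    new List<string>(duplicate3), new List<string>(duplicate4) };
bool isequal = original.All(x => copy.All(y => y.Remove(x)))
               && copy.All(n => n.Count == 0);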

UPDATE2

Thanks to Eric: using a HashSet can be very efficient, as follows. This won't account for duplicates, though.

List<HashSet<string>> copy2 = new List<HashSet<string>>{new HashSet<string>(duplicate1),
                                                        new HashSet<string>(duplicate2),
                                                        new HashSet<string>(duplicate3),
                                                        new HashSet<string>(duplicate4)};
  HashSet<string> originalhashset = new HashSet<string>(original);
  bool eq = copy2.All(x => originalhashset.SetEquals(x));

UPDATE3 Thanks to Eric: the original code in this post with SequenceEqual will work if the lists are sorted first. Because SequenceEqual takes the order of the collections into account, the collections need to be sorted before calling it. I guess this is not much of a problem, as sorting is pretty fast (O(n log n)).
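A minimal sketch of the sort-then-compare idea, for illustration only. The class and method names (SortedComparison, AllEqualIgnoringOrder) are hypothetical, and ordinal string ordering is an assumption chosen to match the default string equality that SequenceEqual uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortedComparison
{
    // Hypothetical helper: true when every list in 'others' holds the same
    // items as 'original', with the same multiplicities, in any order.
    public static bool AllEqualIgnoringOrder(
        List<string> original, params List<string>[] others)
    {
        // OrderBy returns a new sequence, so the caller's lists stay untouched.
        List<string> sortedOriginal =
            original.OrderBy(s => s, StringComparer.Ordinal).ToList();

        return others.All(other =>
            // Cheap early exit before paying for a sort.
            other.Count == sortedOriginal.Count &&
            other.OrderBy(s => s, StringComparer.Ordinal)
                 .SequenceEqual(sortedOriginal));
    }
}
```

It would be called as SortedComparison.AllEqualIgnoringOrder(original, duplicate1, duplicate2, duplicate3, duplicate4). Unlike the HashSet version, this one treats duplicates as significant.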

UPDATE4 As per Brian's suggestion, I can use a lookup for this.

var originallkup = original.ToLookup(i => i);    
var lookuplist = new List<ILookup<string, string>>
                                    {   duplicate4.ToLookup(i=>  i), 
                                        duplicate3.ToLookup(i=>  i), 
                                        duplicate2.ToLookup(i=>  i),
                                        duplicate1.ToLookup(i=>  i)
                                    };

bool isequal = (lookuplist.Sum(x => x.Count) == (originallkup.Count * 4)) &&       
   (originallkup.All(x => lookuplist.All(i => i[x.Key].Count() == x.Count())));

Thank you all for your responses.

Upvotes: 5

Views: 1227

Answers (3)

Eric Lippert

Reputation: 660573

I have a List. I duplicate the List many times and use it for different purposes. At some point I need to check if the contents of all these collections are same.

A commenter then asks:

Is the order important? Or just the content?

And you respond:

only the content is important

In that case you are using the wrong data structure in the first place. Use a HashSet<T>, not a List<T>, to represent an unordered collection of items that must be cheaply compared for set equality.

Once you have everything in hash sets instead of lists, you can simply use their SetEquals method to see if any pair of sets is unequal.

Alternatively: keep everything in lists, until the point where you want to compare for equality. Initialize a hash set from one of the lists, and then use SetEquals to compare that hash set to every other list.
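The second option might look roughly like the sketch below. The names SetComparison and AllSetEqual are hypothetical; HashSet<T>.SetEquals accepts any IEnumerable<T>, so the remaining lists do not have to be converted first. Note that set equality ignores both order and duplicates.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SetComparison
{
    // Hypothetical helper: true when every list in 'others' contains exactly
    // the same distinct items as 'original' (order and duplicates ignored).
    public static bool AllSetEqual(
        List<string> original, params List<string>[] others)
    {
        // Build the hash set once, from one of the lists.
        var originalSet = new HashSet<string>(original);

        // All stops at the first list that is not set-equal.
        return others.All(other => originalSet.SetEquals(other));
    }
}
```

With the lists from the question, the call would be SetComparison.AllSetEqual(original, duplicate1, duplicate2, duplicate3, duplicate4).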

Upvotes: 7

tobias86

Reputation: 5029

I honestly can't think of a more efficient solution, but as for reducing the number of lines of code, give this a bash:

var allLists = new List<List<string>>() { original, duplicate1, duplicate2, duplicate3, duplicate4 };

bool allEqual = allLists.All(l => l.SequenceEqual(original));

Or, use the Any operator - might be better in terms of performance.

bool allEqual = !allLists.Any(l => !l.SequenceEqual(original));

EDIT: Confirmed, Any will stop enumerating the source once it determines a value (All does the same as soon as an element fails the predicate). Thank you MSDN.

EDIT # 2: I have been looking into the performance of SequenceEqual. This guy has a nice post comparing SequenceEqual to a more imperative function. I modified his example to work with List<string> and my findings match his. It would appear that as far as performance is concerned, SequenceEqual isn't high on the list of preferred methods.

Upvotes: 4

Vinicius Ottoni

Reputation: 4677

You can use reflection to create a generic comparer and always use it. Take a look at this thread; it has a lot of code that can help you: Comparing two collections for equality irrespective of the order of items in them
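For reference, here is one order-insensitive comparison sketched with plain generics instead of reflection. It is not taken from the linked thread; the names MultisetComparison and HaveSameItems are hypothetical, and the sketch assumes the items are non-null and use their default equality.

```csharp
using System.Collections.Generic;

public static class MultisetComparison
{
    // Hypothetical helper: true when both sequences contain the same items
    // with the same multiplicities, regardless of order.
    // Assumption: items are non-null (Dictionary keys cannot be null).
    public static bool HaveSameItems<T>(
        IEnumerable<T> first, IEnumerable<T> second)
    {
        var counts = new Dictionary<T, int>();

        // Count how often each item occurs in the first sequence.
        foreach (T item in first)
        {
            counts.TryGetValue(item, out int seen); // seen is 0 if absent
            counts[item] = seen + 1;
        }

        // Consume one count per item of the second sequence.
        foreach (T item in second)
        {
            if (!counts.TryGetValue(item, out int left) || left == 0)
                return false; // item missing, or it occurs too often
            counts[item] = left - 1;
        }

        // Anything still above zero occurred more often in 'first'.
        foreach (int remaining in counts.Values)
        {
            if (remaining != 0) return false;
        }
        return true;
    }
}
```

Checking several copies against the original is then a matter of calling HaveSameItems(original, duplicateN) for each copy. This runs in roughly linear time, whereas sorting first costs O(n log n).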

Upvotes: 0
