TestK

Reputation: 333

Comparing List<String> with duplicate information using Linq

To compare two List<string> objects and extract their differences, I use LINQ's Except.

For example, say I want to compare the following two lists for equality using LINQ:

var List1 = new List<string> { "0", "1", "2", "2", "3" };
var List2 = new List<string> { "0", "1", "2", "3" };

List<string> differences1 = List1.Except(List2).ToList();
List<string> differences2 = List2.Except(List1).ToList();

Both differences1 and differences2 will be empty, because Except treats its inputs as sets and every distinct value exists in both lists. The lists are NOT equal, though: List1 has an extra "2". I want to be able to extract all differences between the lists, including duplicates that one list has and the other does not.

What is the best method of extracting all differences between two List<string> objects?

Upvotes: 3

Views: 1147

Answers (5)

erikH

Reputation: 2346

You could create copies of both lists and then remove from each copy every item that also exists in the other:

var diff1 = list1.ToList();
var diff2 = list2.ToList();
diff1.RemoveAll(diff2.Remove);
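
To make the side effects of those three lines easier to see, here is a minimal sketch that wraps them in a helper. The class name `BagDiff` and the method name `Both` are hypothetical names chosen for this illustration, not part of the answer above.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class BagDiff
{
    // Hypothetical helper: pairs off equal items one-for-one and returns
    // whatever is left over in each list.
    public static (List<string> OnlyInFirst, List<string> OnlyInSecond) Both(
        IEnumerable<string> list1, IEnumerable<string> list2)
    {
        // Work on copies so the callers' lists are not modified.
        var diff1 = list1.ToList();
        var diff2 = list2.ToList();

        // List<T>.Remove deletes one occurrence and returns true if it found one.
        // Used as the RemoveAll predicate, each item of diff1 that can be matched
        // in diff2 is removed from both lists; unmatched items stay where they are.
        diff1.RemoveAll(diff2.Remove);

        return (diff1, diff2);
    }
}
```

With the lists from the question, `BagDiff.Both(List1, List2)` should leave a single "2" in `OnlyInFirst` and nothing in `OnlyInSecond`. Note that `List<T>.Remove` is a linear search, so this approach is quadratic in the worst case, which is fine for short lists.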

Upvotes: 0

Servy

Reputation: 203842

So what you're looking for is an Except that works on bags (multisets), not on sets. If one sequence has two copies of an item and you subtract a sequence with one copy, there should be one copy left, rather than both sequences being reduced to distinct sets before the subtraction, as Except does.

This makes it slightly less elegant to handle, but it's still not terrible. Rather than having a HashSet to represent the items in the other set, you simply need to have a dictionary mapping the item to the number of copies. Then for each item, if it's in the dictionary, remove one from the count and don't yield it, and if it isn't in the dictionary then it should be yielded.

public static IEnumerable<T> BagDifference<T>(IEnumerable<T> first
    , IEnumerable<T> second)
{
    var dictionary = second.GroupBy(x => x)
        .ToDictionary(group => group.Key, group => group.Count());

    foreach (var item in first)
    {
        int count;
        if (dictionary.TryGetValue(item, out count))
        {
            if (count - 1 == 0)
                dictionary.Remove(item);
            else
                dictionary[item] = count - 1;
        }
        else
            yield return item;
    }
}
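
As a usage sketch, the method above can be applied in both directions to the lists from the question. The block repeats the method so that it is self-contained; the class name `BagExtensions` and the `Demo` wrapper are assumptions made for this example only.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class BagExtensions
{
    // Same logic as the method above, repeated here so the sketch compiles on its own.
    public static IEnumerable<T> BagDifference<T>(IEnumerable<T> first, IEnumerable<T> second)
    {
        // Number of copies of each item in `second` that are still unmatched.
        var remaining = second.GroupBy(x => x)
            .ToDictionary(group => group.Key, group => group.Count());

        foreach (var item in first)
        {
            if (remaining.TryGetValue(item, out int count))
            {
                // Matched: consume one copy from `second` and do not yield the item.
                if (count == 1)
                    remaining.Remove(item);
                else
                    remaining[item] = count - 1;
            }
            else
            {
                // No unmatched copy left in `second`: the item is a difference.
                yield return item;
            }
        }
    }

    // Hypothetical wrapper that computes both directions for the question's lists.
    public static (List<string> Differences1, List<string> Differences2) Demo()
    {
        var list1 = new List<string> { "0", "1", "2", "2", "3" };
        var list2 = new List<string> { "0", "1", "2", "3" };

        return (BagDifference(list1, list2).ToList(),
                BagDifference(list2, list1).ToList());
    }
}
```

Here `Differences1` should contain the one surplus "2" from the first list, and `Differences2` should be empty, because every item of the second list is matched by a copy in the first.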

Upvotes: 5

Becuzz

Reputation: 6857

You could use Distinct to eliminate the duplicates and then do the comparison.

var distinctList1 = List1.Distinct().ToList();
var distinctList2 = List2.Distinct().ToList();

var differences1 = distinctList1.Except(distinctList2).ToList();
var differences2 = distinctList2.Except(distinctList1).ToList();

Upvotes: 0

Zach

Reputation: 447

You could call .Distinct() on the lists before comparing them:

List<string> differences1 = List1.Distinct().Except(List2).ToList();
List<string> differences2 = List2.Distinct().Except(List1).ToList();

Upvotes: 0

Hogan

Reputation: 70528

You could group by the key and then compare the groups using Except().

It would look like this (not tested, might have typos):

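// IGrouping objects compare by reference, so project each group to an
// anonymous (Key, Count) type, which compares by value.
var groupList1 = List1.GroupBy(x => x).Select(g => new { g.Key, Count = g.Count() }).ToList();
var groupList2 = List2.GroupBy(x => x).Select(g => new { g.Key, Count = g.Count() }).ToList();

var differences1 = groupList1.Except(groupList2).ToList();
var differences2 = groupList2.Except(groupList1).ToList();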

Upvotes: 0
