TheNextman

Reputation: 12566

LINQ GroupBy collection

Is it possible to GroupBy in LINQ, using a collection property?

e.g.

void Main()
{
    var t1 = new Test() { Children = new List<string>() { "one", "two" } };
    var t2 = new Test() { Children = new List<string>() { "one", "two" } };
    var t3 = new Test() { Children = new List<string>() { "one", "three" } };

    var tests = new List<Test>() { t1, t2, t3 };
    var anon =  from t in tests
                select new
                {
                    Children = t.Children
                };

    anon.GroupBy(t => t.Children).Dump();
}

public class Test
{
    public List<string> Children {get;set;}
}

In this example, I would hope for two groups:

Key: List() { "one", "two" } Value: t1, t2

Key: List() { "one", "three" } Value: t3

My understanding is that anonymous types are compared not by reference, but by comparing equality on their public properties.

However, the actual result is three groups:

Key: List() { "one", "two" } Value: t1

Key: List() { "one", "two" } Value: t2

Key: List() { "one", "three" } Value: t3

If this is not possible, is there a way to get the result I want?

Hopefully I've explained this clearly.

Upvotes: 8

Views: 2110

Answers (4)

Ani

Reputation: 113472

You get three groups because List<T> uses default reference equality; it does not consider the "sequence equality" of the elements contained in any two lists. If you want such semantics, you'll have to implement an IEqualityComparer<IList<T>> (or similar) yourself and pass it to the GroupBy overload that accepts an equality comparer. Here's a sample implementation (for arrays, not lists, but easily adaptable).
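For lists specifically, a minimal sketch of such a comparer might look like the following. The name ListSequenceComparer and the hash-combining constants (17 and 31) are my own choices for illustration, not anything from the framework; it treats two lists as equal when they contain equal elements in the same order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Test
{
    public List<string> Children { get; set; }
}

// Hypothetical comparer: lists are equal when their elements match in order.
public class ListSequenceComparer<T> : IEqualityComparer<List<T>>
{
    public bool Equals(List<T> x, List<T> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(List<T> list)
    {
        if (list == null) return 0;
        unchecked
        {
            // Order-sensitive combination, so it agrees with SequenceEqual.
            int hash = 17;
            foreach (T item in list)
                hash = hash * 31 + EqualityComparer<T>.Default.GetHashCode(item);
            return hash;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var tests = new List<Test>
        {
            new Test { Children = new List<string> { "one", "two" } },
            new Test { Children = new List<string> { "one", "two" } },
            new Test { Children = new List<string> { "one", "three" } },
        };

        var groups = tests
            .GroupBy(t => t.Children, new ListSequenceComparer<string>())
            .ToList();

        foreach (var g in groups)
            Console.WriteLine($"{string.Join(", ", g.Key)}: {g.Count()} item(s)");
    }
}
```

GetHashCode has to be consistent with Equals here: GroupBy buckets keys by hash first, so two lists that are sequence-equal must also produce the same hash, otherwise they would never be compared.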

If you're comfortable with set equality (order and duplicates are irrelevant), you're in luck: you can directly use HashSet<T> and the provided CreateSetComparer method for the comparer implementation:

  // Assumes Test.Children is declared as HashSet<string> rather than List<string>
  var t1 = new Test { Children = new HashSet<string> { "one", "two" } };
  var t2 = new Test { Children = new HashSet<string> { "one", "two" } };
  var t3 = new Test { Children = new HashSet<string> { "one", "three" } };

  var tests = new List<Test> { t1, t2, t3 };

  // Only two groups: { one, two } and { one, three }
  tests.GroupBy(t => t.Children, HashSet<string>.CreateSetComparer())
       .Dump();

Upvotes: 2

CassOnMars

Reputation: 6181

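The problem is that the lists are not the same instance. GroupBy compares keys for equality, and you have two separate new List<string>s, which are not equal by reference even though their contents match. You can, however, build a string key by joining the hash codes of the elements, which produces the expected result for this example (in general, hash collisions or ambiguous concatenation could merge lists that are actually different):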

tests.GroupBy(t => String.Join(string.Empty, t.Children.Select(c => c.GetHashCode().ToString())));
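A variation on the same idea, sketched here as an alternative rather than as what the answer above does, is to join the strings themselves with a separator instead of their hash codes. The separator ("\u001F", the ASCII unit separator) and the helper name KeyFor are assumptions for illustration; the approach only holds if the separator never occurs inside a child string, and it is order-sensitive.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Test
{
    public List<string> Children { get; set; }
}

public static class Program
{
    // Assumption: this character never appears inside any child string.
    private const string Separator = "\u001F";

    // Hypothetical helper: builds an order-sensitive string key from the list.
    public static string KeyFor(Test t)
    {
        return string.Join(Separator, t.Children);
    }

    public static void Main()
    {
        var tests = new List<Test>
        {
            new Test { Children = new List<string> { "one", "two" } },
            new Test { Children = new List<string> { "one", "two" } },
            new Test { Children = new List<string> { "one", "three" } },
        };

        var groups = tests.GroupBy(t => KeyFor(t)).ToList();

        foreach (var g in groups)
            Console.WriteLine($"{g.Key.Replace(Separator, ", ")}: {g.Count()} item(s)");
    }
}
```

Because the key is an ordinary string, GroupBy's default string equality does the comparison and no custom comparer is needed.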

Upvotes: 0

wsanville

Reputation: 37516

By default, GroupBy uses the key type's default equality, which for lists (reference types that don't override Equals) is reference equality.

Since you've got new instances of the list each time, they are not equal.

However, there is an overload of GroupBy which lets you specify a custom IEqualityComparer, so that you can implement your own way of comparing a list of strings, for example.

There are many other threads here about comparing two lists that show how to implement this.

Upvotes: 4

Yochai Timmer

Reputation: 49269

I don't think there's a built-in method to do that.

Look at Jon Skeet's answer here:

Any chance to get unique records using Linq (C#)?

Upvotes: 0
