user2665268

Reputation: 43

LINQ Distinct on multiple properties both directions

I'm trying to filter a list of objects by two of its properties. An object can be removed if it is a duplicate of another object, or if it is a mirrored duplicate: its first property equals the other object's second property, and its second property equals the other object's first property.

Example:

object0: id0=1A, id1=2B
object1: id0=1A, id1=2B
object2: id0=1A, id1=3C
object3: id0=2B, id1=1A
object4: id0=2B, id1=3C
object5: id0=3C, id1=2B

So the following happens: object0 removes object1 and object3, and object4 removes object5.

Final Output:

object0: id0=1A, id1=2B
object2: id0=1A, id1=3C
object4: id0=2B, id1=3C 

Right now I have for loops doing this, but I was wondering if there is a way to do this with LINQ. I've tried using Distinct, DistinctBy, and GroupBy. Do I need to write my own comparer to accomplish this?

Upvotes: 4

Views: 770

Answers (2)

Amy B

Reputation: 110071

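Group by a key that is the same for both orderings of the pair, then take the first item of each group. For numeric ids: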

source
  .GroupBy(x => new {min = Math.Min(x.Id0, x.Id1), max = Math.Max(x.Id0, x.Id1)})
  .Select(g => g.First());

Tested.

    public void SillyTuplesTest()
    {
        List<Tuple<string, int, int>> source = new List<Tuple<string, int, int>>()
        {
            Tuple.Create("object0", 1, 2),
            Tuple.Create("object1",1, 2),
            Tuple.Create("object2",1, 3),
            Tuple.Create("object3",2, 1),
            Tuple.Create("object4",2, 3),
            Tuple.Create("object5",3, 2)
        };

        var result = source
            .GroupBy(x => new { min = Math.Min(x.Item2, x.Item3), max = Math.Max(x.Item2, x.Item3) })
            .Select(g => g.First());

        foreach (Tuple<string, int, int> resultItem in result)
        {
            Console.WriteLine("{0} ({1}, {2})", resultItem.Item1, resultItem.Item2, resultItem.Item3);
        }
    }

Results

object0 (1, 2)
object2 (1, 3)
object4 (2, 3)

For strings, you could use:

source
  .GroupBy(x =>
    // Ordinal comparison matches the ordinal equality used by the anonymous-type key.
    string.CompareOrdinal(x.Id0, x.Id1) < 0 ?
    new {min = x.Id0, max = x.Id1} :
    new {min = x.Id1, max = x.Id0})
  .Select(g => g.First());

If you had an unknown number of strings, you could use a HashSet<string> as the key, together with the comparer returned by HashSet<string>.CreateSetComparer().

IEqualityComparer<HashSet<string>> comparer = 
  HashSet<string>.CreateSetComparer();

source
  .GroupBy(x => new HashSet<string>(x.GetStrings()), comparer)
  .Select(g => g.First());
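
As a rough, self-contained sketch of the set-keyed grouping above, the following applies it to the data from the question. The `SetKeyExample` class, its method names, and the `(Name, Ids)` tuple shape are assumptions made for illustration; they stand in for the `x.GetStrings()` call, whose real shape is not given here. It assumes C# 7 or later for tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SetKeyExample
{
    // Hypothetical helper: keeps the first item of each group whose ids
    // form the same set, regardless of the order they appear in.
    public static List<(string Name, string[] Ids)> FirstPerIdSet(
        IEnumerable<(string Name, string[] Ids)> source)
    {
        IEqualityComparer<HashSet<string>> comparer =
            HashSet<string>.CreateSetComparer();

        return source
            .GroupBy(x => new HashSet<string>(x.Ids), comparer)
            .Select(g => g.First())
            .ToList();
    }

    // The six objects from the question, as (name, ids) tuples.
    public static List<(string Name, string[] Ids)> QuestionData()
    {
        return new List<(string Name, string[] Ids)>
        {
            ("object0", new[] { "1A", "2B" }),
            ("object1", new[] { "1A", "2B" }),
            ("object2", new[] { "1A", "3C" }),
            ("object3", new[] { "2B", "1A" }),
            ("object4", new[] { "2B", "3C" }),
            ("object5", new[] { "3C", "2B" }),
        };
    }
}
```

Calling `SetKeyExample.FirstPerIdSet(SetKeyExample.QuestionData())` should leave object0, object2, and object4. One thing to keep in mind: a set ignores how many times a value occurs, so items whose ids differ only in repetition (for example A, A, B versus A, B, B) end up in the same group.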

Upvotes: 3

Joe Brunscheon

Reputation: 1989

Create an extension method such as this one:

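public static IEnumerable<TSource> DistinctBy<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    // Iterator block: a fresh set is built each time the result is enumerated.
    var seenKeys = new HashSet<TKey>();
    foreach (var element in source)
    {
        // Add returns false when the key has already been seen.
        if (seenKeys.Add(keySelector(element))) yield return element;
    }
}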

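Then call it with a key selector that returns the same key for both orderings of the two properties, for example the pair sorted so that the smaller value comes first.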
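
To make the usage concrete, here is a minimal sketch under a few assumptions. The extension is renamed `DistinctByKey` only so that it cannot clash with the `Enumerable.DistinctBy` method built into .NET 6 and later; `UnorderedKey`, `Deduplicate`, and the `(Name, Id0, Id1)` tuple shape are hypothetical names chosen for the example. It assumes C# 7 or later for tuple syntax.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PairExtensions
{
    // Same idea as the extension above, under a hypothetical name.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var seenKeys = new HashSet<TKey>();
        foreach (var element in source)
        {
            // Add returns false when the key has already been seen.
            if (seenKeys.Add(keySelector(element)))
                yield return element;
        }
    }

    // Order-independent key: (a, b) and (b, a) map to the same tuple.
    public static (string Min, string Max) UnorderedKey(string a, string b) =>
        string.CompareOrdinal(a, b) <= 0 ? (a, b) : (b, a);

    // Keeps the first item seen for each unordered (Id0, Id1) pair.
    public static List<(string Name, string Id0, string Id1)> Deduplicate(
        IEnumerable<(string Name, string Id0, string Id1)> source) =>
        source.DistinctByKey(x => UnorderedKey(x.Id0, x.Id1)).ToList();
}
```

With the six objects from the question passed in their original order, `Deduplicate` should keep object0, object2, and object4, since each later item with a repeated or mirrored pair maps to a key that is already in the set.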

Upvotes: -1
