frenchie

Reputation: 51937

LINQ query with Distinct and Union

I currently have 2 queries that are returning lists of MyModel like this:

var q1 = ....
         select new MyModel()
         {
             TheData1 = ...
             TheData2 = ...
             TheUniqueID = ...
         }

var q2 = ....
         select new MyModel()
         {
             TheData1 = ...
             TheData2 = ...
             TheUniqueID = ...
         }

If in q1 I have:

TheUniqueID = 2,3,6,9,11 

and in q2 I have:

TheUniqueID = 2,4,7,9,12

How do I write the query so that I get a list of MyModel where

TheUniqueID = 2,3,4,6,7,9,11,12

In other words, each TheUniqueID is present only once (i.e., 2 and 9 are not repeated).

I started looking at Union and Distinct, but I'm wondering whether I need 2 from statements or not.

Any suggestions are welcome.

Upvotes: 24

Views: 54625

Answers (4)

OzBob

Reputation: 4520

Inefficient single-line answer with no IEqualityComparer

Using MoreLinq source code as inspiration, this will give a unique list:

Short answer (the OrderBy isn't necessary, but without it the result comes out as 2,3,6,9,11,4,7,12):

var concattedUniqueList = theUniqueIDList1.Concat(theUniqueIDList2)
            .GroupBy(f=>f.UniqueID, f=>f).Select(g => g.First()).OrderBy(f=>f.UniqueID);

Complete answer:

//INPUT
//theUniqueIDList1 = 2,3,6,9,11 
//theUniqueIDList2 = 2,4,7,9,12
//OUTPUT
//2,3,4,6,7,9,11,12
public class MyModel
{
    public string TheData1 { get; set; }
    public string TheData2 { get; set; }
    public int UniqueID { get; set; }
}

public static void GroupByEx1()
{
    // Create two lists of models with overlapping UniqueID values (2 and 9).
    List<MyModel> theUniqueIDList1 = new List<MyModel>
    {
        new MyModel { TheData1 = "Barley",   UniqueID = 2 },
        new MyModel { TheData1 = "Boots",    UniqueID = 3 },
        new MyModel { TheData1 = "Whiskers", UniqueID = 6 },
        new MyModel { TheData1 = "Daisy",    UniqueID = 9 },
        new MyModel { TheData1 = "Preti",    UniqueID = 11 }
    };
    List<MyModel> theUniqueIDList2 = new List<MyModel>
    {
        new MyModel { TheData1 = "Barley",   UniqueID = 2 },
        new MyModel { TheData1 = "Henry",    UniqueID = 4 },
        new MyModel { TheData1 = "Walsh",    UniqueID = 7 },
        new MyModel { TheData1 = "Daisy",    UniqueID = 9 },
        new MyModel { TheData1 = "Ugly",     UniqueID = 12 }
    };

    // Concatenate, sort, group by UniqueID, and keep the first model in each group.
    var concattedUniqueList = theUniqueIDList1.Concat(theUniqueIDList2)
        .OrderBy(f => f.UniqueID)
        .GroupBy(f => f.UniqueID, f => f)
        .Select(g => g.First());

    foreach (var item in concattedUniqueList)
    {
        Console.WriteLine("UniqueId: {0}({1})", item.UniqueID, item.TheData1);
    }
}

void Main()
{
    GroupByEx1();
    //2,3,4,6,7,9,11,12
}

Note on speed: timing each approach over 10000 runs gave 698 ns for the Concat/GroupBy version and 100 ns for the IEqualityComparer version.

Developed in LINQPad.

Upvotes: 4

Thomas Li

Reputation: 3338

I think frenchie wants a list of MyModel back rather than just the TheUniqueID values.

You need to create a MyModelTheUniqueIDComparer class and pass a new instance of it as a second argument into Union:

class MyModelTheUniqueIDComparer : IEqualityComparer<MyModel>
{
    public bool Equals(MyModel x, MyModel y)
    {
        return x.TheUniqueID == y.TheUniqueID;
    }

    // If Equals() returns true for a pair of objects 
    // then GetHashCode() must return the same value for these objects.

    public int GetHashCode(MyModel myModel)
    {
        return myModel.TheUniqueID.GetHashCode();
    }
}

Then you can call Union to get the result:

var result = q1.Union(q2, new MyModelTheUniqueIDComparer());

See http://msdn.microsoft.com/en-us/library/bb358407.aspx for more details.

Update:

Try this:

public class A
{
    public string TheData1 { get; set; }
    public string TheData2 { get; set; }
    public string UniqueID { get; set; }
}

public class AComparer : IEqualityComparer<A>
{

    #region IEqualityComparer<A> Members

    public bool Equals(A x, A y)
    {
        return x.UniqueID == y.UniqueID;
    }

    public int GetHashCode(A obj)
    {
        return obj.UniqueID.GetHashCode();
    }

    #endregion
}

And test with this:

var listOfA = new List<A>();
var q1 = from a in listOfA
                 select new A()
             {
                 TheData1 = "TestData",
                 TheData2 = "TestData",
                 UniqueID = a.UniqueID
             };

var anotherListOfA = new List<A>();
var q2 = from a in anotherListOfA
                 select new A()
                 {
                     TheData1 = "TestData",
                     TheData2 = "TestData",
                     UniqueID = a.UniqueID
                 };

var result = q1.Union(q2, new AComparer());

Make sure you have using System.Linq; at the top of the file.
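The test above runs against empty lists, so it compiles but returns nothing. Here is a minimal sketch of the same idea using the IDs from the question. The Make helper and the "TestData" values are hypothetical, added only so the example runs on its own, and the null checks in Equals are an extra precaution that the original comparer does not have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyModel
{
    public string TheData1 { get; set; }
    public string TheData2 { get; set; }
    public int TheUniqueID { get; set; }
}

// Treats two MyModel instances as equal when their TheUniqueID values match.
public class MyModelTheUniqueIDComparer : IEqualityComparer<MyModel>
{
    public bool Equals(MyModel x, MyModel y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.TheUniqueID == y.TheUniqueID;
    }

    // Must agree with Equals: equal IDs produce equal hash codes.
    public int GetHashCode(MyModel obj) => obj.TheUniqueID.GetHashCode();
}

public static class Program
{
    // Hypothetical helper: builds sample models for the given IDs.
    public static List<MyModel> Make(params int[] ids) =>
        ids.Select(id => new MyModel { TheData1 = "TestData", TheUniqueID = id }).ToList();

    public static void Main()
    {
        var q1 = Make(2, 3, 6, 9, 11);
        var q2 = Make(2, 4, 7, 9, 12);

        // Union drops the duplicates (2 and 9); OrderBy is only for display.
        var result = q1.Union(q2, new MyModelTheUniqueIDComparer())
                       .OrderBy(m => m.TheUniqueID);

        Console.WriteLine(string.Join(",", result.Select(m => m.TheUniqueID)));
        // prints 2,3,4,6,7,9,11,12
    }
}
```

The result is still a sequence of MyModel objects, not bare IDs, which is what the question asks for.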

Upvotes: 29

BrokenGlass

Reputation: 160902

As was pointed out, if you are combining the lists with .Union(), you will have to define uniqueness by using the overload that takes an IEqualityComparer for your type.

var result = q1.Union(q2, myEqualityComparer);

Alternatively, and more easily, you could use DistinctBy(x => x.TheUniqueID) from the MoreLinq project:

var result = q1.Concat(q2).DistinctBy(c => c.TheUniqueID);
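For readers on .NET 6 or later, System.Linq ships its own Enumerable.DistinctBy, so the same call works without MoreLinq. The sketch below assumes that runtime. The Merge helper and the sample data are hypothetical, and the OrderBy is there only to make the printed order predictable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyModel
{
    public string TheData1 { get; set; }
    public int TheUniqueID { get; set; }
}

public static class Program
{
    // Hypothetical helper: concatenates both sequences and keeps one model per TheUniqueID.
    public static IEnumerable<MyModel> Merge(IEnumerable<MyModel> q1, IEnumerable<MyModel> q2) =>
        q1.Concat(q2)
          .DistinctBy(m => m.TheUniqueID)   // built in since .NET 6
          .OrderBy(m => m.TheUniqueID);

    public static void Main()
    {
        var q1 = new[] { 2, 3, 6, 9, 11 }.Select(id => new MyModel { TheData1 = "A", TheUniqueID = id });
        var q2 = new[] { 2, 4, 7, 9, 12 }.Select(id => new MyModel { TheData1 = "B", TheUniqueID = id });

        Console.WriteLine(string.Join(",", Merge(q1, q2).Select(m => m.TheUniqueID)));
        // prints 2,3,4,6,7,9,11,12
    }
}
```

No comparer class is needed here because the key selector tells DistinctBy what "same" means.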

Upvotes: 6

vlad

Reputation: 4778

Union creates an Enumerable with unique values from both collections. In other words, you don't need Distinct.

edit: example of Union here

edit2: I forgot that you're not concatenating lists of UniqueIDs, so I removed the suggested code, which was wrong. You should be able to do a simple Union if you implement an IEqualityComparer, but that might be overkill.
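A small sketch of why the distinction matters, assuming a MyModel class that does not override Equals or GetHashCode (the class below is a cut-down stand-in, not the asker's real type). Union removes duplicates from the int lists, but without a comparer it treats every MyModel object as different.

```csharp
using System;
using System.Linq;

public class MyModel
{
    public int TheUniqueID { get; set; }
}

public static class Program
{
    public static void Main()
    {
        int[] ids1 = { 2, 3, 6, 9, 11 };
        int[] ids2 = { 2, 4, 7, 9, 12 };

        // ints compare by value, so Union alone removes the duplicates 2 and 9.
        Console.WriteLine(ids1.Union(ids2).Count()); // prints 8

        var q1 = ids1.Select(id => new MyModel { TheUniqueID = id });
        var q2 = ids2.Select(id => new MyModel { TheUniqueID = id });

        // The default comparer falls back to reference equality for MyModel,
        // so all ten objects are kept, including both 2s and both 9s.
        Console.WriteLine(q1.Union(q2).Count()); // prints 10
    }
}
```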

Upvotes: 14
