Reputation: 24403
I have two collections of objects of different type. Lets call them type ALPHA and type BRAVO. Each of these types has a property that is the "ID" for the object. No ID is duplicated within the class, so for any given ID, there is at most one ALPHA and one BRAVO instance. What I need to do is divide them into 3 categories:
In all 3 cases, I need to have the actual objects from the collections at hand for subsequent manipulation.
I know for the #3 case, I can do something like:
var myCorrelatedItems = myAlphaItems.Join(myBravoItems, alpha => alpha.Id, beta => beta.Id, (inner, outer) => new
{
alpha = inner,
beta = outer
});
I can also write code for the #1 and #2 cases which look something like
var myUnmatchedAlphas = myAlphaItems.Where(alpha=>!myBravoItems.Any(bravo=>alpha.Id==bravo.Id));
And similarly for unMatchedBravos. Unfortunately, this would result in iterating the collection of alphas (which may be very large!) many times, and the collection of bravos (which may also be very large!) many times as well.
Is there any way to unify these query concepts so as to minimize iteration over the lists? These collections can have thousands of items.
Upvotes: 4
Views: 597
Reputation: 113412
If you are only interested in the IDs,
var alphaIds = myAlphaItems.Select(alpha => alpha.ID);
var bravoIds = myBravoItems.Select(bravo => bravo.ID);
var alphaIdsNotInBravo = alphaIds.Except(bravoIds);
var bravoIdsNotInAlpha = bravoIds.Except(alphaIds);
If you want the alphas and bravos themselves,
var alphaIdsSet = new HashSet<int>(alphaIds);
var bravoIdsSet = new HashSet<int>(bravoIds);
var alphasNotInBravo = myAlphaItems
.Where(alpha => !bravoIdsSet.Contains(alpha.ID));
var bravosNotInAlpha = myBravoItems
.Where(bravo => !alphaIdsSet.Contains(bravo.ID));
EDIT: A few other options:
ExceptBy
method from MoreLinq.Enumerable.ToDictionary
method.IHasId
interface), you could write your own IEqualityComparer<T>
implementation; Enumerable.Except
has an overload that accepts an equality-comparer as a parameter.Upvotes: 2
Reputation: 8163
I think LINQ is not the best answer to this problem if you want to traverse and compare the minimum amount of times. I think the following iterative solution is more performant. And I believe that code readability doesn't suffer.
var dictUnmatchedAlphas = myAlphaItems.ToDictionary(a => a.Id);
var myCorrelatedItems = new List<AlphaAndBravo>();
var myUnmatchedBravos = new List<Bravo>();
foreach (Bravo b in myBravoItems)
{
var id = b.Id;
if (dictUnmatchedAlphas.ContainsKey(id))
{
var a = dictUnmatchedAlphas[id];
dictUnmatchedAlphas.Remove(id); //to get just the unmatched alphas
myCorrelatedItems.Add(new AlphaAndBravo { a = a, b = b});
}
else
{
myUnmatchedBravos.Add(b);
}
}
Definition of AlphaAndBravo:
public class AlphaAndBravo {
public Alpha a { get; set; }
public Bravo b { get; set; }
}
Upvotes: 0
Reputation: 2752
Here is one possible LINQ solution that performs a full outer join on both sets and appends a property to them showing which group they belong to. This solution might lose its luster, however, when you try to separate the groups into different variables. It all really depends on what kind of actions you need to perform on these objects. At any rate this ran at (I thought) an acceptable speed (.5 seconds) for me on lists of 5000 items:
var q =
from g in
(from id in myAlphaItems.Select(a => a.ID).Union(myBravoItems.Select(b => b.ID))
join a in myAlphaItems on id equals a.ID into ja
from a in ja.DefaultIfEmpty()
join b in myBravoItems on id equals b.ID into jb
from b in jb.DefaultIfEmpty()
select (a == null ?
new { ID = b.ID, Group = "Bravo Only" } :
(b == null ?
new { ID = a.ID, Group = "Alpha Only" } :
new { ID = a.ID, Group = "Both" }
)
)
)
group g.ID by g.Group;
You can remove the 'group by' query or create a dictionary from this (q.ToDictionary(x => x.Key, x => x.Select(y => y))
), or whatever! This is simply a way of categorizing your items. I'm sure there are better solutions out there, but this seemed like a truly interesting question so I thought I might as well give it a shot!
Upvotes: 1
Reputation: 110111
Dictionary<int, Alpha> alphaDictionary = myAlphaItems.ToDictionary(a => a.Id);
Dictionary<int, Bravo> bravoDictionary = myBravoItems.ToDictionary(b => b.Id);
ILookup<string, int> keyLookup = alphaDictionary.Keys
.Union(bravoDictionary.Keys)
.ToLookup(x => alphaDictionary.ContainsKey(x) ?
(bravoDictionary.ContainsKey(x) ? "both" : "alpha") :
"bravo");
List<Alpha> alphaBoth = keyLookup["both"].Select(x => alphaDictionary[x]).ToList();
List<Bravo> bravoBoth = keyLookup["both"].Select(x => bravoDictionary[x]).ToList();
List<Alpha> alphaOnly = keyLookup["alpha"].Select(x => alphaDictionary[x]).ToList();
List<Bravo> bravoOnly = keyLookup["bravo"].Select(x => bravoDictionary[x]).ToList();
Upvotes: 1
Reputation: 131686
Sometimes LINQ is not the answer. This is the kind of problem where I would consider using a HashSet<T>
with a custom comparer to reduce the work of performing set operations. HashSets are much more efficient at performing set operations than lists - and (depending on the data) can reduce the work considerably:
// create a wrapper class that can accomodate either an Alpha or a Bravo
class ABItem {
public Object Instance { get; private set; }
public int Id { get; private set; }
public ABItem( Alpha a ) { Instance = a; Id = a.Id; }
public ABItem( Bravo b ) { Instance = b; Id = b.Id; }
}
// comparer that compares Alphas and Bravos by id
class ABItemComparer : IComparer {
public int Compare( object a, object b ) {
return GetId(a).Compare(GetId(b));
}
private int GetId( object x ) {
if( x is Alpha ) return ((Alpha)x).Id;
if( x is Bravo ) return ((Bravo)x).Id;
throw new InvalidArgumentException();
}
}
// create a comparer based on comparing the ID's of ABItems
var comparer = new ABComparer();
var hashAlphas =
new HashSet<ABItem>(myAlphaItems.Select(x => new ABItem(x)),comparer);
var hashBravos =
new HashSet<ABItem>(myBravoItems.Select(x => new ABItem(x)),comparer);
// items with common IDs in Alpha and Bravo sets:
var hashCommon = new HashSet<Alpha>(hashAlphas).IntersectWith( hashSetBravo );
hashSetAlpha.ExceptWith( hashSetCommon ); // items only in Alpha
hashSetBravo.ExceptWith( hashSetCommon ); // items only in Bravo
Upvotes: 1