Reputation: 18624
I have a list of 500,000 to 1,000,000 instances of MyClass, which has these properties:
    class MyClass
    {
        public string ParentId { get; set; }
        public string Name { get; set; }
        public DateTime StartDate { get; set; }
        public DateTime EndDate { get; set; }
    }
The data could look like this:
    ParentId | Name  | StartDate  | EndDate
    ---------------------------------------------
    parent1  | alpha | 01-01-2011 | 02-02-2015
    parent1  | beta  | 01-01-2011 | 02-02-2014
    parent2  | gamma | 01-01-2012 | 02-02-2011
I need to filter the list so it contains the "alpha" and "gamma" objects. The "beta" object should be excluded because it has the same parent as alpha, but an earlier EndDate.
I.e. the resulting list should only contain one instance per ParentId (the one with the latest EndDate).
The filtering needs to perform well.
Upvotes: 2
Views: 117
Reputation: 96561
The currently accepted answer (by @Kobi) is correct and probably the simplest solution, but it might not be the fastest one.
Since you mentioned that the list can be quite large and that the filtering should perform well, I checked how a solution without LINQ queries performs.
This is my solution:
    var tempDict = new Dictionary<string, MyClass>();
    foreach (var data in list) // list is the List<MyClass>
    {
        MyClass existing;
        if (!tempDict.TryGetValue(data.ParentId, out existing))
        {
            // Put item into temp dictionary (use ParentId as key)
            tempDict[data.ParentId] = data;
        }
        else
        {
            // Check if the instance in the temp dictionary has an
            // earlier EndDate. If yes, replace it.
            if (existing.EndDate < data.EndDate) // replace
                tempDict[data.ParentId] = data;
        }
    }
    var result = tempDict.Values.ToList();
A quick comparison (using 500,000 items) showed that this solution is about 3 to 4 times faster than the LINQ version (depending on the number of unique ParentId values).
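A comparison like the one described above could be set up roughly as follows. This is only a sketch, not the code that was actually used for the measurement: the FilterBenchmark class, the Generate helper, the seed, and the number of distinct parents are all assumptions made up for illustration, and the timings will vary by machine and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public class MyClass
{
    public string ParentId { get; set; } = "";
    public string Name { get; set; } = "";
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
}

public static class FilterBenchmark
{
    // Dictionary-based approach: keep the item with the latest EndDate per ParentId.
    public static List<MyClass> LatestPerParentDictionary(List<MyClass> list)
    {
        var latest = new Dictionary<string, MyClass>();
        foreach (var item in list)
        {
            if (!latest.TryGetValue(item.ParentId, out var existing)
                || existing.EndDate < item.EndDate)
            {
                latest[item.ParentId] = item;
            }
        }
        return latest.Values.ToList();
    }

    // GroupBy/OrderByDescending approach, as in the accepted answer.
    public static List<MyClass> LatestPerParentLinq(List<MyClass> list)
    {
        return list.GroupBy(mc => mc.ParentId)
                   .Select(g => g.OrderByDescending(mc => mc.EndDate).First())
                   .ToList();
    }

    // Hypothetical test data: 'count' items spread over 'parentCount' parents.
    public static List<MyClass> Generate(int count, int parentCount, int seed)
    {
        var random = new Random(seed);
        var baseDate = new DateTime(2011, 1, 1);
        var list = new List<MyClass>(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(new MyClass
            {
                ParentId = "parent" + random.Next(parentCount),
                Name = "item" + i,
                StartDate = baseDate,
                EndDate = baseDate.AddDays(random.Next(1, 3650))
            });
        }
        return list;
    }

    public static void Main()
    {
        // Assumed sizes; adjust parentCount to see how the ratio changes.
        var data = Generate(500000, 50000, 42);

        var watch = Stopwatch.StartNew();
        var viaDictionary = LatestPerParentDictionary(data);
        watch.Stop();
        Console.WriteLine("Dictionary: " + watch.ElapsedMilliseconds + " ms, "
                          + viaDictionary.Count + " items");

        watch.Restart();
        var viaLinq = LatestPerParentLinq(data);
        watch.Stop();
        Console.WriteLine("LINQ:       " + watch.ElapsedMilliseconds + " ms, "
                          + viaLinq.Count + " items");
    }
}
```

For a more trustworthy measurement you would want to run each variant several times and discard the first (JIT warm-up) run; a single Stopwatch pass only gives a rough impression.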
Upvotes: 2
Reputation: 9564
I assume you want to filter out beta for the reasons you explained and not simply because of its name. Here's what you can use to achieve that result:
    myClasses.GroupBy(i => i.ParentId)
             .Select(i => i.OrderByDescending(i2 => i2.EndDate).First());
Upvotes: 2
Reputation: 519
You can use this; the method works fine and is fast with large lists:
    var groupesList = yourList
        .GroupBy(x => x.ParentId,
                 (y, set) => new { Key = y, Value = set.First(s => s.EndDate == set.Max(r => r.EndDate)) })
        .Select(x => x.Value)
        .ToList();
Upvotes: 0
Reputation: 138017
You can use GroupBy and Select:
    var filtered = list
        .GroupBy(mc => mc.ParentId)
        .Select(g => g.OrderByDescending(mc => mc.EndDate).First())
        .ToList();
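As a variation on the same grouping idea, and assuming you are on .NET 6 or later, Enumerable.MaxBy can pick the latest item in each group without sorting the group first. The following is a minimal sketch run against the sample rows from the question; the LatestPerParent class and its Filter method are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MyClass
{
    public string ParentId { get; set; } = "";
    public string Name { get; set; } = "";
    public DateTime StartDate { get; set; }
    public DateTime EndDate { get; set; }
}

public static class LatestPerParent
{
    // One item per ParentId: the one with the latest EndDate.
    // Requires .NET 6+ for Enumerable.MaxBy (assumption, not stated in the question).
    public static List<MyClass> Filter(IEnumerable<MyClass> list)
    {
        return list.GroupBy(mc => mc.ParentId)
                   .Select(g => g.MaxBy(mc => mc.EndDate)!) // groups are never empty
                   .ToList();
    }

    public static void Main()
    {
        // Sample rows from the question.
        var list = new List<MyClass>
        {
            new MyClass { ParentId = "parent1", Name = "alpha",
                          StartDate = new DateTime(2011, 1, 1), EndDate = new DateTime(2015, 2, 2) },
            new MyClass { ParentId = "parent1", Name = "beta",
                          StartDate = new DateTime(2011, 1, 1), EndDate = new DateTime(2014, 2, 2) },
            new MyClass { ParentId = "parent2", Name = "gamma",
                          StartDate = new DateTime(2012, 1, 1), EndDate = new DateTime(2011, 2, 2) },
        };

        foreach (var item in Filter(list))
            Console.WriteLine(item.ParentId + " " + item.Name);
    }
}
```

On older framework versions MaxBy is not available in System.Linq, so the OrderByDescending(...).First() form above remains the portable choice.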
Upvotes: 5
Reputation: 19646
You can easily filter a List<T> using LINQ's Where:
    var result = myList
        .Where(item => item.Name == "gamma" || item.Name == "alpha")
        .ToList();
If you want the output to be distinct on a certain field, you can use either MoreLinq's DistinctBy or GroupBy:
    var result = myList
        .Where(item => item.Name == "gamma" || item.Name == "alpha")
        .GroupBy(item => item.ParentId)
        .Select(g => g.First()) // Selection logic
        .ToList();
Upvotes: 0