Reputation: 4585
I'm reading a file and turning each line within it into a class, let's call it Record, and returning each Record as it is read using IEnumerable<Record> and yield return.
Because of this, the reads only actually happen when I perform an operation on the enumeration, such as summing it or iterating through it with a foreach.
I do need to go through each record and translate it into a database record. However, because of a database design that predates me, every record in the database must carry the totals, so I need those totals before I start the translation.
At the moment I have five separate .Count() or .Sum() operations on my enumeration before I start iterating it (for example, int i = records.Sum(r => r.SomeField) or int j = records.Count(r => r.IsSomethingTrue)). Each of those counts or sums loops through the entire file to calculate its own value. I'm not really happy with this behaviour and would like to find a more efficient way of doing this.
I am using .NET 3.5 if that makes any difference.
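To illustrate the behaviour described above, here is a minimal sketch. The Record class, the ReadRecords method and the Passes counter are hypothetical stand-ins for the real file reader, assumed only for this example; the in-memory loop takes the place of the file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the real Record class.
public class Record
{
    public int SomeField { get; set; }
    public bool IsSomethingTrue { get; set; }
}

public static class DeferredDemo
{
    // Counts how many times the "file" is read from the start.
    public static int Passes;

    // Stand-in for the file reader: records are produced lazily via yield return.
    public static IEnumerable<Record> ReadRecords()
    {
        Passes++; // runs each time an enumeration begins, not when the method is called
        for (int n = 1; n <= 4; n++)
        {
            yield return new Record { SomeField = n, IsSomethingTrue = n % 2 == 0 };
        }
    }

    public static void Main()
    {
        IEnumerable<Record> records = ReadRecords();    // nothing has been read yet
        int i = records.Sum(r => r.SomeField);          // first full pass
        int j = records.Count(r => r.IsSomethingTrue);  // second full pass
        foreach (Record r in records) { }               // third full pass
        Console.WriteLine("{0} {1} {2}", i, j, Passes); // prints "10 2 3"
    }
}
```

Every Sum, Count or foreach restarts the iterator, so five aggregates plus the final loop would mean six passes over the file.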
Upvotes: 2
Views: 87
Reputation: 8197
You could use your own struct to calculate several values in a single pass through an enumerable object.
public struct ComplexAccumulator
{
    public int TotalSumField { get; set; }
    public int CountSomethingTrue { get; set; }
}
Now you can use the Aggregate extension method to accumulate the values:
records.Aggregate(default(ComplexAccumulator), (a, r) => new ComplexAccumulator
{
    TotalSumField = a.TotalSumField + r.SomeField,
    // Parentheses are required: + binds tighter than ?:
    CountSomethingTrue = a.CountSomethingTrue + (r.IsSomethingTrue ? 1 : 0),
});
Instead of the struct you could use a suitable Tuple instance, for example something like Tuple<int, int, int>. Note, however, that Tuple was introduced in .NET 4, so it is not available on .NET 3.5.
Upvotes: 1
Reputation: 171178
Efficiency is not a strength of LINQ... You need to replace some of the LINQ calls with manual loops here.
You seem to need two passes over the data. One for aggregation:
var sum = 0; //etc.
foreach (var item in items) {
//compute all 5 aggregates here
}
And then one to translate the data:
items.Select(item => Translate(item, aggregates))
Whether you should buffer items (for example using ToList) depends on whether the available memory can hold those items.
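Putting the two passes together, a minimal sketch could look like the following. The Record and Totals classes, the Translate method and the sample data are hypothetical and only serve the illustration; the buffering with ToList assumes the records fit in memory.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the real Record class.
public class Record
{
    public int SomeField { get; set; }
    public bool IsSomethingTrue { get; set; }
}

// Hypothetical container for the aggregates; extend it with the remaining totals.
public class Totals
{
    public int SumSomeField;
    public int CountSomethingTrue;
    public int RecordCount;
}

public static class TwoPass
{
    // Pass 1: compute every aggregate in a single loop.
    public static Totals ComputeTotals(IEnumerable<Record> items)
    {
        var totals = new Totals();
        foreach (var item in items)
        {
            totals.SumSomeField += item.SomeField;
            if (item.IsSomethingTrue) totals.CountSomethingTrue++;
            totals.RecordCount++;
        }
        return totals;
    }

    // Hypothetical translation step standing in for the real database mapping.
    public static string Translate(Record item, Totals totals)
    {
        return item.SomeField + "/" + totals.SumSomeField;
    }

    public static void Main()
    {
        IEnumerable<Record> source = new[]
        {
            new Record { SomeField = 1, IsSomethingTrue = true },
            new Record { SomeField = 2, IsSomethingTrue = false },
            new Record { SomeField = 3, IsSomethingTrue = true },
        };

        // Buffer once so a lazily read source is only enumerated a single time.
        List<Record> items = source.ToList();
        Totals totals = ComputeTotals(items);

        // Pass 2: translate each record with the totals already known.
        foreach (string row in items.Select(item => Translate(item, totals)))
        {
            Console.WriteLine(row);
        }
    }
}
```

If the data does not fit in memory, drop the ToList call and accept that the file is read twice: once for the totals and once for the translation.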
You can use Aggregate to perform all five aggregations in one pass, but that is not better than a loop in any way: it is slower, it takes far more code, and the result is arguably illegible.
Upvotes: 0