Reputation: 1916
private void GenerateRecords(JobRequest request)
{
    for (var day = 0; day < daysInRange; day++)
    {
        foreach (var coreId in request.CoreIds)
        {
            foreach (var agentId in Enumerable.Range(0, request.AgentsCount).Select(x => Guid.NewGuid()))
            {
                for (var copiesDone = 0; copiesDone < request.CopiesToMake; copiesDone++)
                {
                    foreach (var jobInfoRecord in request.Jobs)
                    {
                        foreach (var status in request.Statuses)
                        {
                            //DoSomeWork();
                        }
                    }
                }
            }
        }
    }
}
Is there any way to increase the performance of many iterations? I really need to have all of these loops, but I was wondering how I can improve (speed up) the iterations. Maybe by using LINQ?
Upvotes: -1
Views: 14164
Reputation: 106936
It is true that LINQ does not speed up anything compared to "old school" for loops. Instead, LINQ generally adds a small overhead. However, depending on your problem (which you should measure to understand it better) you may improve performance a lot by parallelizing your solution, and here LINQ can be handy.
You can change all the for loops into a single IEnumerable<T> created using LINQ:
var query = from day in Enumerable.Range(0, request.daysInRange)
            from coreId in request.CoreIds
            from agentId in Enumerable.Range(0, request.AgentsCount).Select(x => Guid.NewGuid())
            from copiesDone in Enumerable.Range(0, request.CopiesToMake)
            from jobInfoRecord in request.Jobs
            from status in request.Statuses
            select new
            {
                Day = day,
                CoreId = coreId,
                AgentId = agentId,
                CopiesDone = copiesDone,
                JobInfoRecord = jobInfoRecord,
                Status = status
            };
Assuming that the work you are doing in the loop is to create a new object, you can project the items using Select:
var results = query.Select(item => /* do some work and return something */);
You can then use PLINQ to parallelize the code with the slight change of inserting AsParallel():
var results = query.AsParallel().Select(item => /* do some work and return something */);
If you have a 4 core computer you can expect an almost 4-fold increase in speed as long as the work being done is CPU bound.
Debugging parallel code can be difficult, but assuming that whatever problem you are debugging is not related to the code being executed in parallel, you can easily turn parallelization off by removing .AsParallel(). That can be very convenient.
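For example, a minimal sketch of such a toggle (useParallel and ProcessItem are hypothetical names standing in for your own flag and work; query is the LINQ query built above):

// Sketch: switch between sequential and parallel execution at runtime.
// ProcessItem is a hypothetical stand-in for the per-item work.
var results = useParallel
    ? query.AsParallel().Select(item => ProcessItem(item)).ToList()
    : query.Select(item => ProcessItem(item)).ToList();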
Upvotes: 1
Reputation: 9587
Once you ditch LINQ you're at the mercy of your collection enumerators and the garbage collector.
If you want to squeeze as much performance as possible out of your foreach loops and you have control over your data structures, make sure you use collection types that have struct enumerators (e.g. List<T>, ImmutableArray<T>). Better yet, use plain generic arrays where feasible. Despite a non-struct enumerator they're the fastest collection type in .NET when it comes to raw access speed, at least when building in Release/with optimizations enabled (in which case the compiler emits the same IL for your foreach loops over arrays as it would for for loops, thereby cutting down on allocations and method calls normally associated with using types that implement IEnumerable<T>).
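To make the enumerator point concrete, here is a minimal sketch (not code from the question) contrasting iteration over a List<T> directly with iteration through an interface-typed variable:

using System.Collections.Generic;

var list = new List<int> { 1, 2, 3 };

// Direct foreach: the compiler binds to List<int>.GetEnumerator(),
// which returns the struct List<int>.Enumerator - no heap allocation.
foreach (var x in list) { }

// Interface-typed foreach: GetEnumerator() is resolved through
// IEnumerable<int>, so the struct enumerator gets boxed onto the heap.
IEnumerable<int> sequence = list;
foreach (var x in sequence) { }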
Roslyn has a set of guidelines for hot code paths which are useful in your situation:
- Avoid LINQ.
- Avoid using foreach over collections that do not have a struct enumerator.
Now, the above is valid in any performance-critical scenario. I am, however, somewhat skeptical that collection iteration performance is the bottleneck in your particular situation. It is far more likely that DoSomeWork is taking longer than you'd like. You should profile your GenerateRecords method call to get the definitive answer as to which bits of code need the most attention.
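If you don't have a profiler at hand, even a crude Stopwatch measurement around the call gives you a baseline to compare against after each change (a sketch, assuming the request object from the question):

using System.Diagnostics;

var stopwatch = Stopwatch.StartNew();
GenerateRecords(request); // the method from the question
stopwatch.Stop();
Console.WriteLine($"GenerateRecords took {stopwatch.ElapsedMilliseconds} ms");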
If you believe that your DoSomeWork implementation is optimal, consider parallelising your workload. Provided that your DoSomeWork implementation is pure and doesn't rely on external mutable state (e.g. class variables), you may be able to parallelise some of your loop iterations via Parallel.For or Parallel.ForEach. Your outermost loop looks like a particularly good candidate for that, but you might have to play around with the placement of the parallel loop until you get the desired performance characteristics. As a starting point, here's what I would recommend:
private void GenerateRecords(JobRequest request)
{
    Parallel.For(0, daysInRange, day =>
    {
        foreach (var coreId in request.CoreIds)
        {
            for (var i = 0; i < request.AgentsCount; i++)
            {
                var agentId = Guid.NewGuid();
                for (var copiesDone = 0; copiesDone < request.CopiesToMake; copiesDone++)
                {
                    foreach (var jobInfoRecord in request.Jobs)
                    {
                        foreach (var status in request.Statuses)
                        {
                            //DoSomeWork();
                        }
                    }
                }
            }
        }
    });
}
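If unbounded parallelism oversubscribes the machine (or DoSomeWork touches a shared resource), the ParallelOptions overload of Parallel.For lets you cap the concurrency; a sketch, with the cap value being an assumption you would tune:

var options = new ParallelOptions
{
    // Cap concurrent iterations; Environment.ProcessorCount is a
    // reasonable starting point, but measure and tune for your workload.
    MaxDegreeOfParallelism = Environment.ProcessorCount
};

Parallel.For(0, daysInRange, options, day =>
{
    // ... same loop body as above ...
});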
Upvotes: 5
Reputation: 40709
How many times do you need to do //DoSomeWork()?
Are any of those wasted?
If not, the loops don't matter.
The way you can tell is to run it under the Visual Studio IDE and, while it's running, hit the "Pause" button. Then display the call stack.
Unless //DoSomeWork() takes less than a few instructions, the pause will land in //DoSomeWork().
If you do this 10 times, the fraction of samples landing in that function is roughly the fraction of time the program spends in it.
If 8 out of 10 samples land in that function, and 2 of them land in the loops (most likely the innermost loop), then even if you unrolled the loop or reduced its cost to 0, you wouldn't save more than about 20%.
The thing to spend your effort on is whatever costs the most.
Upvotes: 1