Zinger

Reputation: 9

await Parallel.ForEachAsync in recursive function

I have a recursive function that walks a tree. Can I loop over the child nodes with Parallel.ForEachAsync?

private async Task<List<ResponseBase<BatchRowData>>> SaveDepartments(DepartmentTree node,
    string parentUnitGuid, List<ResponseBase<BatchRowData>> allResponses)
{
    if (parentUnitGuid == null)
    {
        return allResponses;
    }
    await Parallel.ForEachAsync(node.children, async (child, cancellationToken) =>
    {
        ResponseBase<BatchRowData> response = new ResponseBase<BatchRowData>();
        //...do something 
        Unit unit = new Unit();
        unit.SerialNum = child.data.DepartmentNumber;
        unit.UnitName = child.data.DepartmentName;
        unit.ParentUnitGuid = parentUnitGuid;
        string unitGuid = await DBGate.PostAsync<string>("organization/SaveUnit", unit);
        if (unitGuid != null)
        {
            response.IsSuccess = true;
            response.ResponseData.ReturnGuid = unitGuid;
            await SaveDepartments(child, unitGuid, allResponses);
        }
        else
        {
            response.IsSuccess = false;
            response.ResponseData.ErrorDescription = "Failed to Save";
        }
        allResponses.Add(response);
    });
    return allResponses;
}

It works, but I wonder whether the order of the tree levels is always preserved when saving with Parallel.ForEachAsync, because my tree requires it. Or should I use a simple sequential foreach instead?

My application is ASP.NET Core 6.0.

Upvotes: 0

Views: 354

Answers (1)

Theodor Zoulias

Reputation: 43845

First things first, the List<T> collection is not thread-safe, so you should synchronize the threads that are trying to Add to the list:

lock (allResponses) allResponses.Add(response);

Otherwise the behavior of your program is undefined.
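
As a minimal sketch of the two usual options: either keep the List<T> and lock around every Add, or switch to a collection built for concurrent writers such as ConcurrentBag<T> (which does not preserve insertion order). The integer payload and the Task.Yield call are only stand-ins for the ResponseBase<BatchRowData> objects and the real asynchronous work in the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Option 1: keep List<T> and serialize every Add with a lock.
var list = new List<int>();
await Parallel.ForEachAsync(Enumerable.Range(0, 1000), async (i, ct) =>
{
    await Task.Yield();          // stand-in for the real asynchronous work
    lock (list) list.Add(i);     // no await inside the lock, so this is legal
});

// Option 2: use a collection designed for concurrent writers.
var bag = new ConcurrentBag<int>();
await Parallel.ForEachAsync(Enumerable.Range(0, 1000), async (i, ct) =>
{
    await Task.Yield();
    bag.Add(i);
});

// Both collections end up holding all 1000 items.
Console.WriteLine($"{list.Count} {bag.Count}");
```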

Now regarding the validity of calling Parallel.ForEachAsync recursively: it doesn't endanger the correctness of your program. Each recursive call is made only after the parent's SaveUnit request has returned a GUID, so a parent is always saved before its children; only the relative order of siblings and of unrelated branches is not guaranteed. It does, however, present the challenge of how to control the degree of parallelism. In general, when you parallelize an operation you want to be able to limit the degree of parallelism, because over-parallelizing can be harmful to both the client and the server. You don't want to exhaust the memory, the CPU or the network bandwidth of either end.

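The ParallelOptions.MaxDegreeOfParallelism property limits the parallelism of one specific Parallel.ForEachAsync loop only. It has no recursive effect: every nested loop gets its own independent limit, so the total number of concurrent operations can grow with each level of the tree. You can't depend on it here.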

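The easiest way to solve this problem is to use a SemaphoreSlim as the throttler, and enclose the throttled work inside the body of the Parallel.ForEachAsync loop (here, the DBGate.PostAsync call) in a try/finally block. Release the semaphore before the recursive SaveDepartments call; otherwise parents hold slots while waiting for their descendants, and the loop can deadlock once all slots are held by parents:

await semaphore.WaitAsync();
try
{
    // The throttled work (not the recursive call)
}
finally { semaphore.Release(); }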

The semaphore should be initialized with the desired maximum parallelism for both of its arguments (initialCount and maxCount). For example, if the desired limit is 5:

SemaphoreSlim semaphore = new(5, 5);
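
Putting the pieces together, a rough sketch of a throttled recursive walk could look like the following. The Node class, the SaveAsync stand-in for DBGate.PostAsync, the limit of 2 and the in-flight counters are all assumptions made for illustration; they are not part of the question's code. In this sketch the semaphore guards only the save call and is released before recursing, so a parent never occupies a slot while it waits for its descendants.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical tree: a root with 4 children, each with 4 children of its own.
var root = new Node("root");
for (int i = 0; i < 4; i++)
{
    var child = new Node($"c{i}");
    for (int j = 0; j < 4; j++) child.Children.Add(new Node($"c{i}.{j}"));
    root.Children.Add(child);
}

SemaphoreSlim semaphore = new(2, 2); // at most 2 saves in flight, tree-wide
object gate = new();
int inFlight = 0, peak = 0, saved = 0;

// Stand-in for DBGate.PostAsync; records how many calls overlap.
async Task<string> SaveAsync(Node node)
{
    int now = Interlocked.Increment(ref inFlight);
    lock (gate) peak = Math.Max(peak, now);
    await Task.Delay(10);
    Interlocked.Decrement(ref inFlight);
    Interlocked.Increment(ref saved);
    return Guid.NewGuid().ToString();
}

async Task SaveChildrenAsync(Node parent)
{
    await Parallel.ForEachAsync(parent.Children, async (child, ct) =>
    {
        await semaphore.WaitAsync(ct);
        try { await SaveAsync(child); }     // only the save is throttled
        finally { semaphore.Release(); }
        await SaveChildrenAsync(child);     // recurse after releasing the slot
    });
}

await SaveChildrenAsync(root);

// All 20 descendants are saved, and peak never exceeds the limit of 2.
Console.WriteLine($"saved={saved}, peak={peak}");

class Node
{
    public string Name { get; }
    public List<Node> Children { get; } = new();
    public Node(string name) => Name = name;
}
```

Because every save goes through the same semaphore, the limit applies to the whole tree regardless of how many nested loops are active at once.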

Another way of solving this problem would be to have a single non-recursive Parallel.ForEachAsync loop, which is fed with the items of a Channel<(DepartmentTree, string)>, and inside the body of the loop you write the children of each department to the channel. This should be more efficient than nesting multiple Parallel.ForEachAsync loops inside one another, but it is also more complex to implement and more prone to bugs. So I'll leave it out of this answer.
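
For readers who want to see the shape of the channel-based idea anyway, here is a rough sketch under stated assumptions, not a definitive implementation. The Node class, the Enqueue helper, the pending counter used to decide when to complete the channel, and the Task.Delay stand-in for the save call are all hypothetical. The main subtlety is knowing when the work is finished: the sketch counts items that were written but not yet fully processed, and completes the writer when that count drops to zero.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical tree: a root with 4 children, each with 4 children of its own.
var root = new Node("root");
for (int i = 0; i < 4; i++)
{
    var child = new Node($"c{i}");
    for (int j = 0; j < 4; j++) child.Children.Add(new Node($"c{i}.{j}"));
    root.Children.Add(child);
}

var channel = Channel.CreateUnbounded<(Node Dept, string ParentGuid)>();
int pending = 0; // items written to the channel but not yet fully processed
int saved = 0;

void Enqueue(Node node, string parentGuid)
{
    Interlocked.Increment(ref pending);
    channel.Writer.TryWrite((node, parentGuid)); // unbounded channel: write succeeds
}

foreach (var child in root.Children) Enqueue(child, "root-guid");
if (pending == 0) channel.Writer.Complete(); // nothing to do for an empty tree

// A single loop, so this one limit applies to the whole tree.
var options = new ParallelOptions { MaxDegreeOfParallelism = 2 };
await Parallel.ForEachAsync(channel.Reader.ReadAllAsync(), options, async (item, ct) =>
{
    await Task.Delay(10, ct);                 // stand-in for the save request
    string guid = Guid.NewGuid().ToString();  // stand-in for the returned GUID
    Interlocked.Increment(ref saved);

    // Children are enqueued only after their parent has been saved.
    foreach (var child in item.Dept.Children) Enqueue(child, guid);

    // Complete the channel once the last outstanding item is done.
    if (Interlocked.Decrement(ref pending) == 0) channel.Writer.Complete();
});

Console.WriteLine($"saved={saved}"); // all 20 descendants are processed

class Node
{
    public string Name { get; }
    public List<Node> Children { get; } = new();
    public Node(string name) => Name = name;
}
```

Children are counted before their parent's own count is decremented, so the counter cannot reach zero while work is still outstanding.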

Upvotes: 0
