Brian Reischl
Brian Reischl

Reputation: 7356

Chaining Task.WhenAll()

I've come up with a bit of code that chains together multiple calls to Task.WhenAll(). I think this works, but it looks a little bit funny. The purpose is to allow all the Tasks to complete before shutting down a service. Pseudo-code eliding some looping, method declarations, etc...

//initialize once on startup
Task _completion = Task.FromResult(0); 

//Every minute a timer fires and we start some tasks     
// and then chain them into the existing Task
var newTasks = Enumerable.Range(0, 10).Select(_ => Task.Factory.StartNew(() => {/* long running stuff here */})).ToArray();
_completion = Task.WhenAll(_completion, Task.WhenAll(newTasks));

//At some point a shutdown will be requested, and we can just wait for the one Task to complete
_completion.Wait();

Is this a bad idea for any reason? Am I going to end up holding a reference to every Task so they can never get garbage collected, or cause some internal array to become huge, or some other terrible thing?

It feels a little weird to me to repeatedly take the result from Task.WhenAll() and feed it back into Task.WhenAll(). I have taken a look at the source code for Task.WhenAll(), and I don't see anything that indicates this could be a problem. But I'm certainly no expert on the topic.

Upvotes: 3

Views: 1105

Answers (2)

antak
antak

Reputation: 20759

Am I going to end up holding a reference to every Task so they can never get garbage collected

It depends.

A single Task.WhenAll(X) will free up the reference to all tasks in X at the moment every element in X is complete1. Put another way, if you had Task.WhenAll(A, Task.WhenAll(B)), references to B will not be held after it's completed, even if A is not complete. Therefore, as long as tasks in deeper levels keep completing, they should continue to get dropped off.

Note however, if you had a single task way down deep get "stuck" (i.e. never completes). You'll end up a with a chain the continues to grow endlessly.

The way you're adding to the chain (e.g. chain = Task.WhenAll(chain, Task.WhenAll(newTasks))) goes to alleviate the problem, because the inner Task.WhenAll() can still free up tasks even if chain itself is stuck and growing.

On the other hand, the code in the answer posted by Servy doesn't suffer this problem.

1Sourced from reference source (Task.cs):

private sealed class WhenAllPromise : Task<VoidTaskResult>, ITaskCompletionAction
{
    public void Invoke(Task completedTask)
    {
        ...
        // Decrement the count, and only continue to complete the promise if we're the last one.
        if (Interlocked.Decrement(ref m_count) == 0)
        {
            ...
            for (int i = 0; i < m_tasks.Length; i++)
            {
                ...
                // Regardless of completion state, if the task has its debug bit set, transfer it to the
                // WhenAll task.  We must do this before we complete the task.
                if (task.IsWaitNotificationEnabled) this.SetNotificationForWaitCompletion(enabled: true);
                else m_tasks[i] = null; // avoid holding onto tasks unnecessarily
            }
        }
    }
}

Upvotes: 0

Servy
Servy

Reputation: 203820

Am I going to end up holding a reference to every Task so they can never get garbage collected

Task.WhenAll frees up the memory for all of the tasks when all of them finish. This means that any given uncompleted task results in the memory being held onto for all of the other tasks in the same "batch", an every batch "above" it. If your batch sizes are particularly large, and have pretty wide variance in how long they take to complete, that could be a problem. If that's not the case for you, than your code should be fine.

Fortunately, this problem can be optimized rather easily. You can use a class in which you add every active task to a set of tasks, and then remove every task when it finishes. You can then easily wait on every currently active task. This ensures that completed tasks don't have reference to them held onto. Not only does this mean not holding onto older classes for longer than necessary, but it separates out the logic of "holding onto all active tasks" into one place, thus simplifying the logic in your main application. The memory optimization aside, it may improve code clarity.

public class ActiveTaskTracker
{
    private HashSet<Task> tasks = new HashSet<Task>();
    public void Add(Task task)
    {
        if (!task.IsCompleted)//short circuit as an optimization
        {
            lock (tasks)
                tasks.Add(task);
            task.ContinueWith(t => { lock (tasks)tasks.Remove(task); });
        }
    }
    public Task WaitAll()
    {
        lock (tasks)
            return Task.WhenAll(tasks.ToArray());
    }
}

Upvotes: 5

Related Questions