Reputation: 60987
I'm calling a service over HTTP (ultimately using the HttpClient.SendAsync method) from within my code. This code is then called into from a WebAPI controller action. Mostly, it works fine (tests pass) but then when I deploy on say IIS, I experience a deadlock because caller of the async method call has been blocked and the continuation cannot proceed on that thread until it finishes (which it won't).
While I could make most of my methods async I don't feel as if I have a basic understanding of when I'd must do this.
For example, let's say I did make most of my methods async (since they ultimately call other async service methods) how would I then invoke the first async method of my program if I built say a message loop where I want some control of the degree of parallelism?
Since the HttpClient doesn't have any synchronous methods, what can I safely presume to do if I have an abstraction that isn't async
aware? I've read about the ConfigureAwait(false)
but I don't really understand what it does. It's strange to me that it's set after the async invocation. To me that feels as if a race waiting to happen... however unlikely...
WebAPI example:
public HttpResponseMessage Get()
{
var userContext = contextService.GetUserContext(); // <-- synchronous
return ...
}
// Some IUserContextService implementation
public IUserContext GetUserContext()
{
var httpClient = new HttpClient();
var result = httpClient.GetAsync(...).Result; // <-- I really don't care if this is asynchronous or not
return new HttpUserContext(result);
}
Message loop example:
var mq = new MessageQueue();
// we then run say 8 tasks that do this
for (;;)
{
var m = mq.Get();
var c = GetCommand(m);
c.InvokeAsync().Wait();
m.Delete();
}
When you have a message loop that allow things to happen in parallel and you have asynchronous methods, there's a opportunity to minimize latency. Basically, what I want to accomplish in this instance is to minimize latency and idle time. Though I'm actually unsure as to how to invoke into the command that's associated with the message that arrives off the queue.
To be more specific, if the command invocation needs to do service requests there's going to be latency in the invocation that could be used to get the next message. Stuff like that. I can totally do this simply by wrapping up things in queues and coordinating this myself but I'd like to see this work with just some async/await stuff.
Upvotes: 3
Views: 6555
Reputation: 60987
While I appreciate the insight from community members it's always difficult to express the intent of what I'm trying to do but tremendously helpful to get advice about circumstances surrounding the problem. With that, I eventually arrived that the following code.
public class AsyncOperatingContext
{
struct Continuation
{
private readonly SendOrPostCallback d;
private readonly object state;
public Continuation(SendOrPostCallback d, object state)
{
this.d = d;
this.state = state;
}
public void Run()
{
d(state);
}
}
class BlockingSynchronizationContext : SynchronizationContext
{
readonly BlockingCollection<Continuation> _workQueue;
public BlockingSynchronizationContext(BlockingCollection<Continuation> workQueue)
{
_workQueue = workQueue;
}
public override void Post(SendOrPostCallback d, object state)
{
_workQueue.TryAdd(new Continuation(d, state));
}
}
/// <summary>
/// Gets the recommended max degree of parallelism. (Your main program message loop could use this value.)
/// </summary>
public static int MaxDegreeOfParallelism { get { return Environment.ProcessorCount; } }
#region Helper methods
/// <summary>
/// Run an async task. This method will block execution (and use the calling thread as a worker thread) until the async task has completed.
/// </summary>
public static T Run<T>(Func<Task<T>> main, int degreeOfParallelism = 1)
{
var asyncOperatingContext = new AsyncOperatingContext();
asyncOperatingContext.DegreeOfParallelism = degreeOfParallelism;
return asyncOperatingContext.RunMain(main);
}
/// <summary>
/// Run an async task. This method will block execution (and use the calling thread as a worker thread) until the async task has completed.
/// </summary>
public static void Run(Func<Task> main, int degreeOfParallelism = 1)
{
var asyncOperatingContext = new AsyncOperatingContext();
asyncOperatingContext.DegreeOfParallelism = degreeOfParallelism;
asyncOperatingContext.RunMain(main);
}
#endregion
private readonly BlockingCollection<Continuation> _workQueue;
public int DegreeOfParallelism { get; set; }
public AsyncOperatingContext()
{
_workQueue = new BlockingCollection<Continuation>();
}
/// <summary>
/// Initialize the current thread's SynchronizationContext so that work is scheduled to run through this AsyncOperatingContext.
/// </summary>
protected void InitializeSynchronizationContext()
{
SynchronizationContext.SetSynchronizationContext(new BlockingSynchronizationContext(_workQueue));
}
protected void RunMessageLoop()
{
while (!_workQueue.IsCompleted)
{
Continuation continuation;
if (_workQueue.TryTake(out continuation, Timeout.Infinite))
{
continuation.Run();
}
}
}
protected T RunMain<T>(Func<Task<T>> main)
{
var degreeOfParallelism = DegreeOfParallelism;
if (!((1 <= degreeOfParallelism) & (degreeOfParallelism <= 5000))) // sanity check
{
throw new ArgumentOutOfRangeException("DegreeOfParallelism must be between 1 and 5000.", "DegreeOfParallelism");
}
var currentSynchronizationContext = SynchronizationContext.Current;
InitializeSynchronizationContext(); // must set SynchronizationContext before main() task is scheduled
var mainTask = main(); // schedule "main" task
mainTask.ContinueWith(task => _workQueue.CompleteAdding());
// for single threading we don't need worker threads so we don't use any
// otherwise (for increased parallelism) we simply launch X worker threads
if (degreeOfParallelism > 1)
{
for (int i = 1; i < degreeOfParallelism; i++)
{
ThreadPool.QueueUserWorkItem(_ => {
// do we really need to restore the SynchronizationContext here as well?
InitializeSynchronizationContext();
RunMessageLoop();
});
}
}
RunMessageLoop();
SynchronizationContext.SetSynchronizationContext(currentSynchronizationContext); // restore
return mainTask.Result;
}
protected void RunMain(Func<Task> main)
{
// The return value doesn't matter here
RunMain(async () => { await main(); return 0; });
}
}
This class is complete and it does a couple of things that I found difficult to grasp.
As general advice you should allow the TAP (task-based asynchronous) pattern to propagate through your code. This may imply quite a bit of refactoring (or redesign). Ideally you should be allowed to break this up into pieces and make progress as you work towards to overall goal of making your program more asynchronous.
Something that's inherently dangerous to do is to call asynchronous code callously in an synchronous fashion. By this we mean invoking the Wait
or Result
methods. These can lead to deadlocks. One way to work around something like that is to use the AsyncOperatingContext.Run
method. It will use the current thread to run a message loop until the asynchronous call is complete. It will swap out whatever SynchronizationContext
is associated with the current thread temporarily to do so.
Note: I don't know if this is enough, or if you are allowed to swap back the
SynchronizationContext
this way, assuming that you can, this should work. I've already been bitten by the ASP.NET deadlock issue and this could possibly function as a workaround.
Lastly, I found myself asking the question, what is the corresponding equivalent of Main(string[])
in an async
context? Turns out that's the message loop.
What I've found is that there are two things that make out this async
machinery.
SynchronizationContext.Post
and the message loop. In my AsyncOperatingContext
I provide a very simple message loop:
protected void RunMessageLoop()
{
while (!_workQueue.IsCompleted)
{
Continuation continuation;
if (_workQueue.TryTake(out continuation, Timeout.Infinite))
{
continuation.Run();
}
}
}
My SynchronizationContext.Post
thus becomes:
public override void Post(SendOrPostCallback d, object state)
{
_workQueue.TryAdd(new Continuation(d, state));
}
And our entry point, basically the equivalent of an async
main from synchronous context (simplified version from original source):
SynchronizationContext.SetSynchronizationContext(new BlockingSynchronizationContext(_workQueue));
var mainTask = main(); // schedule "main" task
mainTask.ContinueWith(task => _workQueue.CompleteAdding());
RunMessageLoop();
return mainTask.Result;
All of this is costly and we can't just go replace calls to async
methods with this but it does allow us to rather quickly create the facilities required to keep writing async
code where needed without having to deal with the whole program. It's also very clear from this implementation where the worker threads go and how the impact concurrency of your program.
I look at this and think to myself, yeap, that's how Node.js does it. Though JavaScript does not have this nice async/await language support that C# currently does.
As an added bonus, I have complete control of the degree of parallelism, and if I want, I can run my async
tasks completely single threaded. Though, If I do so and call Wait
or Result
on any task, it will deadlock the program because it will block the only message loop available.
Upvotes: 1
Reputation: 456407
While I could make most of my methods async I don't feel as if I have a basic understanding of when I'd must do this.
Start at the lowest level. It sounds like you've already got a start, but if you're looking for more at the lowest level, then the rule of thumb is anything I/O-based should be made async
(e.g., HttpClient
).
Then it's a matter of repeating the async
infection. You want to use async methods, so you call them with await
. So that method must be async
. So all of its callers must use await
, so they must also be async
, etc.
how would I then invoke the first async method of my program if I built say a message loop where I want some control of the degree of parallelism?
It's easiest to put the framework in charge of this. E.g., you can just return a Task<T>
from a WebAPI action, and the framework understands that. Similarly, UI applications have a message loop built-in that async
will work naturally with.
If you have a situation where the framework doesn't understand Task
or have a built-in message loop (usually a Console application or a Win32 service), you can use the AsyncContext
type in my AsyncEx
library. AsyncContext
just installs a "main loop" (that is compatible with async
) onto the current thread.
Since the HttpClient doesn't have any synchronous methods, what can I safely presume to do if I have an abstraction that isn't async aware?
The correct approach is to change the abstraction. Do not attempt to block on asynchronous code; I describe that common deadlock scenario in detail on my blog.
You change the abstraction by making it async
-friendly. For example, change IUserContext IUserContextService.GetUserContext()
to Task<IUserContext> IUserContextService.GetUserContextAsync()
.
I've read about the ConfigureAwait(false) but I don't really understand what it does. It's strange to me that it's set after the async invocation.
You may find my async
intro helpful. I won't say much more about ConfigureAwait
in this answer because I think it's not directly applicable to a good solution for this question (but I'm not saying it's bad; it actually should be used unless you can't use it).
Just bear in mind that async
is an operator with precedence rules and all that. It feels magical at first, but it's really not so much. This code:
var result = await httpClient.GetAsync(url).ConfigureAwait(false);
is exactly the same as this code:
var asyncOperation = httpClient.GetAsync(url).ConfigureAwait(false);
var result = await asyncOperation;
There are usually no race conditions in async
code because - even though the method is asynchronous - it is also sequential. The method can be paused at an await
, and it will not be resumed until that await
completes.
When you have a message loop that allow things to happen in parallel and you have asynchronous methods, there's a opportunity to minimize latency.
This is the second time you've mentioned a "message loop" "in parallel", but I think what you actually want is to have multiple (asynchronous) consumers working off the same queue, correct? That's easy enough to do with async
(note that there is just a single message loop on a single thread in this example; when everything is async, that's usually all you need):
await tasks.WhenAll(ConsumerAsync(), ConsumerAsync(), ConsumerAsync());
async Task ConsumerAsync()
{
for (;;) // TODO: consider a CancellationToken for orderly shutdown
{
var m = await mq.ReceiveAsync();
var c = GetCommand(m);
await c.InvokeAsync();
m.Delete();
}
}
// Extension method
public static Task<Message> ReceiveAsync(this MessageQueue mq)
{
return Task<Message>.Factory.FromAsync(mq.BeginReceive, mq.EndReceive, null);
}
You'd probably also be interested in TPL Dataflow. Dataflow is a library that understands and works well with async
code, and has nice parallel options built-in.
Upvotes: 5