stefanobaghino

Reputation: 12804

Non-determinism of synchronous execution contexts (a.k.a. `parasitic`)

In Scala, as explained in the PR that introduced it, parasitic allows you to steal execution time from other threads by having its Runnables run on the Thread which calls execute and then yielding back control to the caller after all its Runnables have been executed.

It appears to be a neat trick to avoid context switches when:

  1. you are doing a trivial operation that follows a Future produced by an actually long-running operation, or
  2. you are working with an API that doesn't allow you to specify an ExecutionContext for a Future, but you would like to make sure the operation continues on that same thread, without introducing a different thread pool

The PR that originally introduced parasitic further explains that

When using parasitic with abstractions such as Future it will in many cases be non-deterministic as to which Thread will be executing the logic, as it depends on when/if that Future is completed.

This concept is also repeated in the official Scala documentation in the paragraphs about Synchronous Execution Contexts:

One might be tempted to have an ExecutionContext that runs computations within the current thread:

val currentThreadExecutionContext = ExecutionContext.fromExecutor(
  new Executor {
    // Do not do this!
    def execute(runnable: Runnable): Unit = runnable.run()
})

This should be avoided as it introduces non-determinism in the execution of your future.

Future {
  doSomething
}(ExecutionContext.global).map {
  doSomethingElse
}(currentThreadExecutionContext)

The doSomethingElse call might either execute in doSomething’s thread or in the main thread, and therefore be either asynchronous or synchronous. As explained here, a callback should not be both.

I have a few questions:

  1. how is parasitic different from the synchronous execution context in the Scala documentation?
  2. what is the source of non-determinism mentioned in both sources? From the comment in the PR that introduced parasitic it sounds like if doSomething completes very quickly it may return control to the main thread and you may end up actually not running doSomethingElse on a global thread but on the main one. That's what I could make of it, but I would like to confirm.
  3. is there a reliable way to have a computation run on the same thread as its preceding task? I guess using a lazy implementation of the Future abstraction (e.g. IO in Cats) could make this easier and more reliable, but I'd like to understand whether this is possible at all using the Future from the standard library.

Upvotes: 3

Views: 564

Answers (1)

Viktor Klang

Reputation: 26597

  1. parasitic has an upper bound on stack recursion, to try to mitigate the risk of StackOverflowErrors due to nested submissions, and can instead defer Runnables to a queue.
  2. The source of non-determinism is this: if the Future is not yet completed, the callback is registered to run on the completing thread; if the Future is already completed, it runs on the registering thread. Since those two situations can depend on timing, it is not deterministic which thread will execute the code.
  3. How do you know A) which Thread that is, and B) whether it would ever be able to execute another task again?
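The two cases in point 2 can be forced apart with a Promise, so the outcome no longer depends on timing. This is only a minimal sketch, assuming Scala 2.13 or later (where `ExecutionContext.parasitic` is available); the object name `ParasiticThreads`, the helper `callbackThread` and the thread name "completer" are made up for illustration.

```scala
import scala.concurrent.{Await, ExecutionContext, Promise}
import scala.concurrent.duration._

object ParasiticThreads {
  // Returns the name of the thread that ran a callback attached with `parasitic`.
  // (Hypothetical helper, not part of any library.)
  def callbackThread(alreadyCompleted: Boolean): String = {
    val promise = Promise[Int]()

    // Case A: the value exists before the callback is registered.
    if (alreadyCompleted) promise.success(1)

    // Register the callback from the calling ("registering") thread.
    val ranOn = promise.future.map(_ => Thread.currentThread.getName)(ExecutionContext.parasitic)

    // Case B: the value arrives later, from another ("completing") thread.
    if (!alreadyCompleted) {
      val completer = new Thread(() => { promise.success(1); () }, "completer")
      completer.start()
    }

    Await.result(ranOn, 5.seconds)
  }

  def main(args: Array[String]): Unit = {
    println(s"already completed: ${callbackThread(alreadyCompleted = true)}")
    println(s"completed later:   ${callbackThread(alreadyCompleted = false)}")
  }
}
```

In real code nothing decides up front which case applies: it depends on whether the Future happens to finish before or after the callback is attached, which is exactly the race described above.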

I find that it is easier to think about Futures as read-handles for values that may or may not exist at a specific point in time. That nicely untangles the notion of Threads from the picture, and now it is rather about: When a value is available, I want to do X—and here is the ExecutionContext that will do X.
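As a rough illustration of "here is the ExecutionContext that will do X", the sketch below hands the continuation to a dedicated single-threaded context. It does not run the continuation on the thread of the preceding task; instead it makes the executing thread predictable, at the cost of a hand-off between threads. The names `ExplicitContext`, `run` and the thread name "continuations" are assumptions chosen for the example.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object ExplicitContext {
  // Runs a computation on `global` and its continuation on a dedicated thread.
  // Returns the computed value and the name of the thread that ran the continuation.
  def run(): (Int, String) = {
    // A single, named daemon thread that will execute every continuation.
    val executor = Executors.newSingleThreadExecutor { (r: Runnable) =>
      val t = new Thread(r, "continuations")
      t.setDaemon(true)
      t
    }
    val continuations = ExecutionContext.fromExecutorService(executor)

    try {
      val result =
        Future(21)(ExecutionContext.global)
          // Whether or not the Future is already completed at this point,
          // the function below is submitted to `continuations`.
          .map(n => (n * 2, Thread.currentThread.getName))(continuations)

      Await.result(result, 5.seconds)
    } finally continuations.shutdown()
  }

  def main(args: Array[String]): Unit =
    println(run())
}
```

Because the callback is always submitted to the same context, the thread that runs it no longer depends on when the preceding Future completes.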

Upvotes: 4
