Søren Debois

Reputation: 5688

How to parallelise symmetric workers?

If I have an async value work which I want duplicated and executed in parallel so that hardware capabilities for thread execution are exhausted, how would I do that?

E.g., for a short specific example, consider the following silly program, which searches for a random number less than 1001:

let bound = ref System.Int32.MaxValue
let work = 
  async {
    let rand = new System.Random () 
    while !bound > 1000 do
      let x = rand.Next ()
      if x < !bound then
        bound := x
  }

[| work; work; work |] 
|> Async.Parallel
|> Async.RunSynchronously

(Ignore synchronisation issues for bound.)

Here, I run three workers. The program would be correct for any non-zero number of workers, and presumably most efficient when the number of workers is exactly the number of available cores. How do I change this program so that the number of workers is chosen automatically depending on the number of available cores?

Update. Is using Async.Parallel and manually indicating the number of threads the right way to parallelise a CPU-bound computation like the above? If not, what is?

Upvotes: 2

Views: 86

Answers (1)

One way to determine the number of available cores for a Process is something like this:

let numberOfAvailableCores () : int =
    let p = System.Diagnostics.Process.GetCurrentProcess().ProcessorAffinity

    let rec countOnes acc = function
        | 0un -> acc
        | n -> 
            let i = int (n &&& 1un)
            countOnes (acc + i) (n >>> 1)

    countOnes 0 (unativeint p)

I find this more exact than System.Environment.ProcessorCount as, AFAIK, it fails to take ProcessorAffinity into account.
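To connect this back to the question's program, here is a minimal sketch of replicating one async value once per core and running the copies with Async.Parallel. The `work` body is a hypothetical stand-in for the question's loop, and `workerCount` uses Environment.ProcessorCount only to keep the snippet self-contained; the affinity-aware count above could be swapped in.

```fsharp
open System

// Number of workers: one per logical core. This is an assumption for the
// sketch; an affinity-aware count could be used here instead.
let workerCount = Environment.ProcessorCount

// Hypothetical stand-in for the question's `work` value.
let work = async { return Random().Next() }

// Duplicate the same async value workerCount times and run all copies.
let results =
    Array.replicate workerCount work
    |> Async.Parallel
    |> Async.RunSynchronously

printfn "ran %d workers" results.Length
```

Because an async value is only a description of a computation, the same value can appear in the array many times; each copy runs independently.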

Update: As Parallel doesn't expose a function to invoke the same action several times "in parallel", a possible solution could be something like this:

open System                   // Action
open System.Threading.Tasks   // Parallel

let numberOfAvailableCores () : int =
    let p = System.Diagnostics.Process.GetCurrentProcess().ProcessorAffinity

    let rec countOnes acc = function
        | 0un -> acc
        | n -> 
            let i = int (n &&& 1un)
            countOnes (acc + i) (n >>> 1)

    countOnes 0 (unativeint p)

let executeInParallel (a : unit->unit) : unit =
    let cores = numberOfAvailableCores ()

    let actions = 
        [|
            for x in 1..(cores * 2) -> Action a
        |]

    Parallel.Invoke actions

A tip: when trying to estimate whether you have any contention between cores, it can be useful to run on just 1 core and compare the result with the "full" multi-core run. If you have a good solution, you should see a roughly linear improvement when enabling more cores. A simple way to run on just 1 core is to set the ProcessorAffinity flag:

let p = System.Diagnostics.Process.GetCurrentProcess ()
p.ProcessorAffinity <- 1n // Makes this process "single-core"
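The comparison itself can be sketched with a Stopwatch. Note that this sketch swaps in a different technique: instead of changing ProcessorAffinity, it caps the number of concurrent workers with ParallelOptions.MaxDegreeOfParallelism, which approximates the single-core run without touching process settings. The `spin` workload, the `timeWith` helper, and the item count are hypothetical and only there to have something to measure.

```fsharp
open System
open System.Diagnostics
open System.Threading.Tasks

// Hypothetical CPU-bound work item, used only for illustration.
let spin (i: int) : unit =
    let mutable acc = 0.0
    for k in 1 .. 200000 do
        acc <- acc + sqrt (float (i + k))
    ignore acc

// Runs `items` work items with at most `degree` concurrent workers and
// returns the elapsed wall-clock time.
let timeWith (degree: int) (items: int) : TimeSpan =
    let options = ParallelOptions(MaxDegreeOfParallelism = degree)
    let sw = Stopwatch.StartNew()
    Parallel.For(0, items, options, Action<int>(spin)) |> ignore
    sw.Stop()
    sw.Elapsed

let single = timeWith 1 256
let full = timeWith Environment.ProcessorCount 256

printfn "1 worker: %O" single
printfn "%d workers: %O" Environment.ProcessorCount full
```

If the second time is not close to the first divided by the number of cores, that hints at contention or at overhead dominating the work.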

(I am trying my hardest to resist answering questions you didn't ask but I am still weak from the flu)

PS. F# Async workflows are great in many ways, but they are primarily aimed at solving the responsiveness problem, not the scalability problem. That means that if you compose a lot of Async workflows, you will probably lose valuable clock cycles. The example you posted wouldn't suffer, though. For CPU-bound problems I tend to go for Parallel (System.Threading.Tasks.Parallel), as it employs auto-scaling and work stealing to utilize all CPU resources while having low overhead. Task or Hopac are also good alternatives.
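As a rough illustration of the Parallel route, here is a minimal sketch using Parallel.For without specifying a worker count, so the runtime decides how many threads to use. The `inputs`, `outputs`, and `compute` names are hypothetical, and the square root is just a placeholder for real CPU-bound work.

```fsharp
open System
open System.Threading.Tasks

// Hypothetical input data and output buffer.
let inputs = Array.init 1000 id
let outputs : float[] = Array.zeroCreate inputs.Length

// Each index is written by exactly one iteration, so no locking is needed.
let compute (i: int) : unit =
    outputs.[i] <- sqrt (float inputs.[i])

// No degree of parallelism is given; the scheduler picks one.
Parallel.For(0, inputs.Length, Action<int>(compute)) |> ignore

printfn "outputs.[4] = %f" outputs.[4]
```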

PS. If you want to manage the scaling yourself, I believe the rule of thumb is twice the number of cores.

PS. You said to ignore the synchronization issues for bound, and that's fair, but I just want to point out that if a shared resource is frequently accessed by all cores, one probably won't see much performance gain.
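One way to sidestep that, sketched here under assumptions rather than as a fix for the original program: let each worker keep its own running minimum and merge the results only once at the end. The `localMin` helper, the per-worker seeds, and the sample count are all hypothetical; Array.Parallel.init is the F# core library function that runs the initializer for each index in parallel.

```fsharp
open System

// Hypothetical worker: draws `samples` random numbers and returns the
// smallest one seen, keeping its running minimum in a local variable.
let localMin (seed: int) (samples: int) : int =
    let rand = Random(seed)
    let mutable best = Int32.MaxValue
    for _i in 1 .. samples do
        let x = rand.Next()
        if x < best then best <- x
    best

// One worker per logical core (an assumption; any positive count works).
let workers = Environment.ProcessorCount

// The only cross-thread step is the final Array.min over per-worker results.
let overallMin =
    Array.Parallel.init workers (fun i -> localMin i 1000000)
    |> Array.min

printfn "smallest value found: %d" overallMin
```

Unlike the question's loop, this does a fixed amount of work per worker instead of stopping once a shared bound drops below a threshold, so it trades early termination for zero contention.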

Upvotes: 3
