Reputation: 5688
If I have an async value work which I want duplicated and executed in parallel so that hardware capabilities for thread execution are exhausted, how would I do that?
E.g., for a short specific example, consider the following silly program, which searches for a random number less than 1001:
let bound = ref System.Int32.MaxValue
let work =
    async {
        let rand = new System.Random ()
        while !bound > 1000 do
            let x = rand.Next ()
            if x < !bound then
                bound := x
    }
[| work; work; work |]
|> Async.Parallel
|> Async.RunSynchronously
(Ignore synchronisation issues for bound.)
Here, I run three workers; the program would be correct for any non-zero number; and presumably more efficient when the number of workers is exactly the number of available cores. How do I change this program so that the number of workers is chosen automatically depending on the number of available cores?
Update. Is using Async.Parallel and manually indicating the number of threads the right way to parallelise a CPU-bound computation like the above? If not, what is?
Upvotes: 2
Views: 86
Reputation: 11577
One way to determine the number of available cores for a Process is something like this:
let numberOfAvailableCores () : int =
    let p = System.Diagnostics.Process.GetCurrentProcess().ProcessorAffinity
    let rec countOnes acc = function
        | 0un -> acc
        | n ->
            let i = int (n &&& 1un)
            countOnes (acc + i) (n >>> 1)
    countOnes 0 (unativeint p)
I find this more exact than System.Environment.ProcessorCount, which AFAIK fails to take ProcessorAffinity into account.
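To check whether the two numbers actually differ on a given machine, a minimal sketch is to print them side by side. The helper names countSetBits and availableCores are made up for this illustration, and the fallback assumes that ProcessorAffinity throws PlatformNotSupportedException on platforms where it is not supported (as far as I know it is only available on Windows and Linux).

```fsharp
open System
open System.Diagnostics

// Counts the 1-bits in an affinity mask (hypothetical helper name).
let countSetBits (mask : unativeint) : int =
    let rec loop acc = function
        | 0un -> acc
        | n -> loop (acc + int (n &&& 1un)) (n >>> 1)
    loop 0 mask

// Cores this process may run on. Assumption: if the platform does not
// support ProcessorAffinity, fall back to Environment.ProcessorCount.
let availableCores () : int =
    try
        let mask = Process.GetCurrentProcess().ProcessorAffinity
        max 1 (countSetBits (unativeint mask))
    with :? PlatformNotSupportedException ->
        Environment.ProcessorCount

printfn "ProcessorCount: %d" Environment.ProcessorCount
printfn "Affinity-based: %d" (availableCores ())
```

If both lines print the same number, the affinity mask is not restricting the process and either value would do.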
Update: As Parallel doesn't expose a function to invoke one action several times "in parallel", a possible solution could be something like this:
open System
open System.Threading.Tasks

// Counts the set bits in the process affinity mask
let numberOfAvailableCores () : int =
    let p = System.Diagnostics.Process.GetCurrentProcess().ProcessorAffinity
    let rec countOnes acc = function
        | 0un -> acc
        | n ->
            let i = int (n &&& 1un)
            countOnes (acc + i) (n >>> 1)
    countOnes 0 (unativeint p)

// Hands (cores * 2) copies of the same action to Parallel.Invoke
let executeInParallel (a : unit->unit) : unit =
    let cores = numberOfAvailableCores ()
    let actions =
        [|
            for _ in 1..(cores * 2) -> Action a
        |]
    Parallel.Invoke actions
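Applied to the search from the question, the same idea might look like the sketch below. It is only an illustration under a few assumptions: the names search and runCopies are made up, the worker count is taken from Environment.ProcessorCount for brevity (numberOfAvailableCores from above could be swapped in), and synchronisation of bound is still ignored, as the question asks.

```fsharp
open System
open System.Threading.Tasks

let bound = ref Int32.MaxValue

// One worker: draws random numbers until the shared bound is 1000 or less.
// Access to `bound` is deliberately unsynchronised, as in the question.
let search () : unit =
    let rand = Random ()
    while bound.Value > 1000 do
        let x = rand.Next ()
        if x < bound.Value then
            bound.Value <- x

// Hands `workers` copies of the same action to Parallel.Invoke and
// waits until all of them have finished (hypothetical helper name).
let runCopies (workers : int) (a : unit -> unit) : unit =
    Parallel.Invoke (Array.init workers (fun _ -> Action a))

runCopies Environment.ProcessorCount search
printfn "Final bound: %d" bound.Value
```

Note that Parallel.Invoke decides for itself how many of the actions really run at the same time, so the worker count is an upper limit rather than a guarantee.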
A tip: when trying to estimate whether you have any contention between cores, it can be useful to run on just 1 core and compare the result with the "full" core solution. If you have a good solution you should see a linear improvement when enabling more cores. A simple way to run on just 1 core is to set the ProcessorAffinity flag:
let p = System.Diagnostics.Process.GetCurrentProcess ()
p.ProcessorAffinity <- 1n // Makes this process "single-core"
(I am trying my hardest to resist answering questions you didn't ask but I am still weak from the flu)
PS. F# Async workflows are great in many ways but they are primarily aimed at solving the Responsiveness problem, not the Scalability problem. That means if you compose a lot of Async workflows you will probably lose valuable clock cycles. The example you posted wouldn't suffer though. For CPU-bound problems I tend to go for Parallel as it employs auto scaling and work stealing to utilize all CPU resources while having low overhead. Task or hopac are also good alternatives.
PS. If you want to manage the scaling yourself I believe the rule of thumb is twice the number of cores.
PS. You said ignore the synchronization issues for bound and that's fair, but I just want to point out that if one has a shared resource frequently accessed by all cores one probably won't see much performance gain.
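One way around that, sketched here under loud assumptions, is to give every worker its own state and merge the results once at the end. This is not the same program as in the question: each worker draws a fixed number of values instead of looping until the bound drops below 1001, and the names localMinimum, workers and the draw count of 1000000 are made up for the illustration.

```fsharp
open System

// Each worker owns its Random and its running minimum, so the hot loop
// touches no shared state at all.
let localMinimum (draws : int) (seed : int) : int =
    let rand = Random seed
    let mutable best = Int32.MaxValue
    for _ in 1 .. draws do
        let x = rand.Next ()
        if x < best then best <- x
    best

// Assumption: one worker per logical processor, seeded with its index.
let workers = Environment.ProcessorCount

// Run the workers in parallel, then combine their results exactly once.
let bound =
    Array.init workers id
    |> Array.Parallel.map (localMinimum 1000000)
    |> Array.min

printfn "Smallest value found: %d" bound
```

The only cross-core communication left is the final Array.min over one value per worker.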
Upvotes: 3