Reputation: 38758
Let's say I have a lazy Tree
whose leaves are possible solutions to a problem
data Tree a = Node [Tree a] | Leaf (Maybe a)
I need to find just one solution (or find out that there are none).
I have a P-core machine. From both time and memory efficiency considerations, it only makes sense to search along P different branches in parallel.
For example, suppose you have four branches of about the same computational complexity (corresponding to T seconds of CPU time), and each of them has an answer.
If you evaluate all four branches truly in parallel on a dual-core machine, then they all will finish in about 2T seconds.
If you evaluate just the first two branches and postpone the other two, then you'll get an answer in only T seconds, also using twice as less memory.
My question is, is it possible to use any of the parallel Haskell infrastructure (Par monad, parallel strategies, ...) to achieve this, or do I have to use lower-level tools like async?
Upvotes: 5
Views: 436
Reputation: 12795
Both Strategies and the Par monad will only start evaluating a new parallel task if there is a CPU available, so in your example with four branches on a 2-core machine, only two will be evaluated. Furthermore, Strategies will GC the other tasks once you have an answer (although it might take a while to do that).
However, if each of those two branches creates more tasks, then you probably wanted to give priority to the newer tasks (i.e., depth-first), but Strategies at least will give priority to the older tasks. The Par monad I think gives priority to the newer ones (but I'd have to check that), however the Par monad will evaluate all the tasks before returning an answer, because that is how it enforces determinism.
So probably the only way to get this to work exactly as you want it, at the moment, is to write a custom scheduler for the Par monad.
Upvotes: 5
Reputation: 15313
At least Par
monad and strategies from parallel
package allow to build only pure, unconditional parallel systems, which look prettily on such pictures:
a / \ b c \ /\ d e \ ...
While in general case you really need impure inter-thread communications:
solve :: Tree a -> Maybe a
smartPartition :: Tree a -> Int -> [[Tree a]]
smartPartition tree P = ... -- split the tree in fairly even chunks,
-- one per each machine core
solveP :: Tree a -> IO (Maybe a)
solveP tree = do
resRef <- newIORef Nothing
results <- parallel (map work (smartPartition tree))
return (msum results)
where work [] = return Nothing
work (t:ts) = do
res <- readIORef resRef
if (isJust res) then (return res) else do
let tRes = solve t
if (isNothing tRes) then (work ts) else do
writeIORef tRes
return tRes
However if your single leaf computations are sufficiently and equally expensive, unsing strategies should not (I'm not sure) harm performance much:
partitionLeafs :: Tree a -> Int -> [[Tree a]]
solveP :: Tree a -> Maybe a
solveP = msum . map step . transpose . partitionLeafs
where step = msum . parMap rdeepseq solve
P. S. I feel I understand field of the problem not better than you (at least), so you likely already know all the above. I've written this answer to develop discussion, because the question is very interesting for me.
Upvotes: 1