Reputation: 7728
I have a set of problems that I would like to evaluate in parallel. These problems are expressed using a simple expression type very similar to this:
-- Expressions are either a constant value or two expressions
-- combined using a certain operation
data Expr
  = Const NumType
  | Binary BinOp Expr Expr

-- The possible operations
data BinOp = Add | Sub | Mul | Div
  deriving (Eq)
These expressions are built on the fly and should evaluate to a certain result which may be valid or invalid. This is expressed as a monad to stop computation when encountering invalid results.
data Result a
  = Val { val :: a }
  | Exc { exc :: String }

instance Monad Result where
  return = Val
  (Exc e) >>= _ = (Exc e)
  (Val v) >>= g = g v
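(On GHC 7.10 and later a Monad instance also needs Functor and Applicative instances; a minimal sketch for Result, not part of the original code:)
-- Required by GHC 7.10+ (Applicative-Monad proposal); a minimal sketch.
instance Functor Result where
  fmap _ (Exc e) = Exc e
  fmap f (Val v) = Val (f v)

instance Applicative Result where
  pure = Val
  (Exc e) <*> _ = Exc e
  (Val f) <*> r = fmap f r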
To determine a value of each solved problem I have two relevant functions:
eval :: Expr -> Result NumType
score :: Expr -> NumType
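For illustration, a hypothetical eval in this style might use the Result monad to stop on division by zero. NumType, its assumed Integral instance, and the exact error message are guesses here; the real implementation is in the linked gist.
-- A sketch only, not the actual implementation. NumType is assumed
-- to be an integral type and division by zero the invalid case.
eval :: Expr -> Result NumType
eval (Const n) = return n
eval (Binary op l r) = do
  x <- eval l
  y <- eval r
  case op of
    Add -> return (x + y)
    Sub -> return (x - y)
    Mul -> return (x * y)
    Div | y == 0    -> Exc "division by zero"
        | otherwise -> return (x `div` y)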
And finally I have solve functions that each return a [Expr]. This leads to my main function currently looking like this:
main :: IO ()
main = do
  strAvailableNumbers <- getLine
  strTargetNumber <- getLine
  let numbers = parseList strAvailableNumbers
      target  = parseTargetNumber strTargetNumber in
    sequence $ map print $
      solveHeuristic1 (Problem target numbers) [Add] [Sub] ++
      solveHeuristic2 (Problem target numbers)
  return ()
The basic idea is that I read a list of numbers and a target number from stdin and then print expressions on stdout.
But I have two problems that I would like to solve and I am not quite sure how related they are:
Those heuristics run entirely unaware of each other and therefore don't know whether the score of their solution is higher than any other. I would like to introduce some kind of state into the map so that a new Expr is only printed if its score is higher than that of the Expr printed previously.
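(One possible shape for this, sketched under the assumptions that Expr has a Show instance and NumType an Ord instance; printImproving and the starting score are made-up names, not part of the original code:)
-- Sketch: print an expression only if it scores higher than the
-- best one printed so far, threading the best score explicitly.
printImproving :: NumType -> [Expr] -> IO ()
printImproving _ [] = return ()
printImproving best (e:es)
  | score e > best = print e >> printImproving (score e) es
  | otherwise      = printImproving best es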
I would like to do these computations in parallel and attempted to do so by using (parMap rseq) instead of map, compiling with the -threaded option, and running with +RTS -N2. The result is a runtime increase from 5 seconds to 7 seconds. Not what I expected, although time shows that CPU utilization is higher. I guess I am not using parMap correctly, or am doing something wrong by using ++. So how would I run a list of independent functions, each returning a list of elements, in parallel?
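(For reference, a minimal sketch of what that attempt could look like, assuming the two heuristic calls are the units being parallelised; this is not the exact code from the gist:)
import Control.Parallel.Strategies (parMap, rseq)

-- Sketch of the attempt described above: apply each heuristic to the
-- problem via parMap. Note that rseq only evaluates each result list
-- to weak head normal form.
allSolutions :: Problem -> [Expr]
allSolutions p = concat $ parMap rseq ($ p)
  [ \q -> solveHeuristic1 q [Add] [Sub]
  , solveHeuristic2
  ]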
Update: Created a gist with complete source code.
Upvotes: 2
Views: 266
Reputation: 53901
The problem here is that evaluating an IO action with seq does approximately nothing. So you're just running things sequentially with slightly more overhead.
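To illustrate that point with a toy example (not taken from the question's code): forcing an IO action to WHNF does not run it, so the sparks do essentially no work and the printing still happens sequentially afterwards.
import Control.Parallel.Strategies (parMap, rseq)

-- Each spark only evaluates an IO action to WHNF, which does not
-- execute it; all the printing still happens sequentially below.
actions :: [IO ()]
actions = parMap rseq print [1 .. 10 :: Int]

main :: IO ()
main = sequence_ actions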
You can refactor things to make them pure again:
main :: IO ()
main = do
  mapM_ (`seq` print "found it") -- make sure we're not
                                 -- benchmarking printing stuff
    . concat
    . parMap rdeepseq (solve [1..10000000])
    $ [42, 42]
  return ()
And add instances of NFData to use rdeepseq, which will fully evaluate things:
instance NFData BinOp -- BinOp is just an enum, WHNF = NF

instance NFData Expr where
  rnf (Const a) = a `deepseq` ()
  rnf (Binary b e1 e2) = b `deepseq` e1 `deepseq` e2 `deepseq` ()
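(As an aside: with deepseq >= 1.4 and a reasonably recent GHC, these instances can also be derived instead of written by hand; a sketch assuming those versions:)
{-# LANGUAGE DeriveGeneric, DeriveAnyClass #-}

import GHC.Generics (Generic)
import Control.DeepSeq (NFData)

-- Derived NFData via Generic, equivalent to the hand-written instances.
data Expr
  = Const NumType
  | Binary BinOp Expr Expr
  deriving (Generic, NFData)

data BinOp = Add | Sub | Mul | Div
  deriving (Eq, Generic, NFData)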
And now if we run it we get... a stack overflow. I bumped up the size of the search enough to make it take long enough to be worth benchmarking, and now fully loading both structures into memory blows the stack. Raising the stack size to the point where we don't blow everything up leaves us running 40% faster with -N2 than without (3 vs 5 seconds), which I would consider the expected result. Visually, when running this, I can see 2 cores briefly jump up to 100%.
Final compilation sequence:
> ghc -O2 -threaded -rtsopts bench.hs
> ./bench +RTS -K10000000 -N2
Upvotes: 3