matt
matt

Reputation: 2039

Haskell : parallel list computation

I am trying to follow the implementation of the countdown problem shown in this video (https://www.youtube.com/watch?v=rlwSBNI9bXE&) and I thought it would be a good problem to try and run in parallel?

data Op =
    Add
  | Sub
  | Mul
  | Div deriving (Eq)

data Expr = 
    Val Int 
  | App Op Expr Expr

--{ helper functions }  

solutions' :: [Int] -> Int -> [Expr]
solutions' ns n =
  [e | ns' <- choices ns, (e, m) <- results ns', m == n]

I tried following some other posts on how to do this and came up with something like this

instance NFData Op where
  rnf Add = Add `deepseq` ()
  rnf Sub = Sub `deepseq` ()
  rnf Mul = Mul `deepseq` ()
  rnf Div = Div `deepseq` ()

instance NFData Expr where
  rnf (Val a)     = a `deepseq` ()
  rnf (App o l r) = (rnf  o) `deepseq` (rnf l) `deepseq` (rnf r)

solutions' :: [Int] -> Int -> [Expr]
solutions' ns n =
  ([e | ns' <- choices ns, (e, m) <- results ns', m == n]
    `using` parList rdeepseq)

It compiles but the program crashes when i try and run it. To be honest I was really just guessing on what I wrote.

How do I get this to run in parallel?

when I run in GHCI

>λ= r = (solutions' [1,3,7,10,25,50] 765)
(0.00 secs, 0 bytes)
>λ= mapM_ print r
*** Exception: stack overflow
>λ=

if i compile with ghc ./Countdown.hs +RTS -N8 -s

and then run the executable, it does not terminate.

Upvotes: 1

Views: 300

Answers (1)

HTNW
HTNW

Reputation: 29193

Ok, so I just clicked at a random timestamp in the video, and by sheer luck I got a slide that describes what's wrong.

For our example, only about 5 million of the 33 million possible expressions are valid.

So, this means that you are evaluating

_fiveMillionList `using` parList rdeepseq

Now, the way (`using` parList _strat) works is that it immediately forces the entire spine of the list. When you begin evaluating your expression, parList forces all the cells of the list to exist. Further, as @DavidFletcher notes, your parallelism is actually useless. Because the filtration is underneath the using, forcing the entire spine of the list also forces all 33 million Exprs to exist, because you need to know how many elements passed the (==) test, so you need to create the Exprs to test them. They don't need to all exist simultaneously, but, in the end, 5 million of them (not counting the Exprs recursively contained in them), plus 5 million (:) constructors, will be held in memory. To add insult to injury, you proceed to create 5 million more objects in the form of dud sparks. And, all of this is being orchestrated by 5 million calls to the Eval monad's (>>=) function. I'm not sure which one of these exactly is sitting resident in memory for long enough to cause a stack overflow, but I'm fairly sure that parList is the culprit.

Perhaps try a more reasonable Strategy. I think you are pretty much forced into using parBuffer, because you need laziness. Using parBuffer n strat, if you evaluate a (:)-cell, then the strategy ensures that the next n - 1 elements have been sparked. So, essentially, it "runs ahead" of any consumer that starts at the head of the list, maintaining a buffer of parallely-evaluated elements. Something like parBuffer 1000 rdeepseq should be fine.


Your NFData instances could use some work. They aren't the problem, but they don't really demonstrate a sound understanding of how evaluation works. I'll just leave them here:

instance NFData Op where
  -- (seq/deepseq) x y is defined by
  -- "if you want to evaluate (seq/deepseq) x y to WHNF, then you must
  -- evaluate x to WHNF/NF, then evaluate y to WHNF."
  -- but e.g. Add is already in WHNF and NF, so seq Add and deeqseq Add are no-ops
  -- the actual evaluation is already finished by the case in rnf's equations
  -- you could even write rnf x = x `seq` (), but I think it's best to be explicit
  rnf Add = ()
  rnf Sub = ()
  rnf Mul = ()
  rnf Div = ()

instance NFData Expr where
  rnf (Val a)     = a `deepseq` ()
  -- rnf o, rnf l :: ()
  -- WHNF and NF are the same thing for the type (); all constructors are nullary
  -- therefore (deepseq (rnf x) y) = seq (rnf x) y
  -- but then seq (rnf x) y = deepseq x y {by definition}
  rnf (App o l r) = o `deepseq` l `deepseq` rnf r

Upvotes: 1

Related Questions