mattematt
mattematt

Reputation: 545

Why does my Haskell code not appear to run in Parallel

I am trying to solve a 2-sum algorithm problem for Standford university online course on coursera. I need to find all distinct pairs x+y in a list that sum to a value t in a range [-10000 .. 10000]. I know there more efficient implementations but I thought it would be a good time to try and do some Haskell parallel programming.

I have tried to implement parellelisation just by looping through half of the range in two different threads (which I think are called sparks). My code is the following:

module Main where

import Data.List
import qualified Data.Map as M
import Debug.Trace
import Control.Parallel (par,pseq)

main :: IO ()
main = interact run

range :: [Int]
range = [negate 10000..10000]

emptyMap :: M.Map Int Bool
emptyMap = M.fromList $ zip [] []

run :: String -> String
run xs = let parsedInput = map (read :: String -> Int) $ words xs
             hashMap = M.fromList $ zip parsedInput (repeat True)
             pcalc r = map (\t -> trace (show t) (countVals hashMap parsedInput t)) r
             bot = pcalc (take (div (length range) 2) range)
             top = pcalc (drop (div (length range) 2) range)
             out = top `par` bot `pseq` (sum bot + sum top)
          in show out

countVals :: M.Map Int Bool -> [Int] -> Int -> Int
countVals m ks t = foldl' go 0 ks
  where go acum x = if M.lookup y m == Just True
                            && y /= x
                            then 1
                            else acum
                              where y = t - x

You can see I have two variables top and bot which I am trying to calculate in parallel via

out = top `par` bot `pseq` (sum bot + sum top)

which is what I thought other stack overflow answers are recommending. However when I compile and run I only seem to see the trace from the bot variable.

% stack ghc --package parallel -- -threaded Main.hs                                           
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking Main ...
% ./Main +RTS -N8 < input.txt                                                                   
-10000
-9999
-9998
-9997
-9996
...

Whereas I was expecting something like:

% ./Main +RTS -N8 < input.txt                                                                    
-10000
0
-9999
1
-9998
2
-9997
-9996
...

Can someone help point out what exactly I am doing wrong? Thanks

Upvotes: 3

Views: 163

Answers (1)

chi
chi

Reputation: 116139

Let's focus on this part:

bot = pcalc (take (div (length range) 2) range)
top = pcalc (drop (div (length range) 2) range)
out = top `par` bot `pseq` (sum bot + sum top)

Here, bot and top are lists. When we seq, pseq or par a value we cause it to be evaluated; since Haskell is lazy, evaluation stops when the "weak head normal form" is reached, i.e. until the first constructor appears in the result. For list values, this means that they are reduced to either [] or unevaluatedHead : unevaluatedTail.

Because of this, top `par` bot `pseq` ... only parallelizes the evaluation of the first cell of the lists, and not their full contents. The whole lists will only get evaluated after pseq when we sum them, but that is run on only one core.

To force the code to be parallel, we can parallelize the sums instead:

sumBot = sum bot
sumTop = sum top
out = sumBot `par` sumTop `pseq` sumBot + sumTop

Since evaluating the sums to WHNF requires evaluating the whole list, this should properly parallelize the computation.

Upvotes: 3

Related Questions