castle-bravo
castle-bravo

Reputation: 1429

How can I reduce this tree in parallel in Haskell?

I have a simple tree which stores a sequence of values in its leaves and some simple functions to facilitate testing.

If I have an unbounded number of processors and the tree is balanced, I should be able reduce the tree using any binary associative operation (+, *, min, lcm) in logarithmic time.

By making Tree an instance of Foldable, I can reduce the tree sequentially from left to right or right to left using built-in functions, but this takes linear time.

How can I use Haskell to reduce such a tree in parallel?

{-# LANGUAGE DeriveFoldable #-}

data Tree a = Leaf a | Node (Tree a) (Tree a)
            deriving (Show, Foldable)

toList :: Tree a -> [a]
toList = foldr (:) []

range :: Int -> Int -> Tree Int
range x y
  | x < y     = Node (range x y') (range x' y)
  | otherwise = Leaf x
  where
    y' = quot (x + y) 2
    x' = y' + 1

Upvotes: 1

Views: 349

Answers (2)

ErikR
ErikR

Reputation: 52049

Update

I originally answered the question under the assumption that the reduction operation was not expensive. Here's an answer which performs an associative reduction in chunks of n elements.

That is, suppose op is an associative binary operation and you want to compute foldr1 op [1..6], here's code which will evaluate it as:

(op (op 1 2) (op 3 4)) (op 5 6)

which allows for parallel evaluation.

import Control.Parallel.Strategies
import System.TimeIt
import Data.List.Split
import Debug.Trace

recChunk :: ([a] -> a) -> Int -> [a] -> a
recChunk op n xs =
  case chunksOf n xs of
    [a] -> op a
    cs  -> recChunk op n $ parMap rseq op cs

data N = N Int | Op [N]
  deriving (Show)

test1 = recChunk Op 2 $ map N [1..10]
test2 = recChunk Op 3 $ map N [1..10]

fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

fib' n | trace msg False = undefined
  where msg = "fib called with " ++ show n
fib' n = fib n

sumFib :: [Int] -> Int
sumFib xs | trace msg False = undefined
  where msg = "sumFib: " ++ show xs
sumFib xs = seq s (s + (mod (fib' (40 + mod s 2)) 1))
  where s = sum xs

main = do
  timeIt $ print $ recChunk sumFib 2 [1..20]

Original Answer

Since you have an associative operation, you can just use your toList function and evaluate the list in parallel with parMap or parList.

Here is some demo code which adds up the fib of each Leaf. I use parBuffer to avoid creating too many sparks - this is not needed if your tree is smallish.

I'm loading a tree from a file because it seemed that GHC with -O2 was detecting common sub-expressions in my test tree.

Also, adjust rseq to your needs - you may need rdeepseq depending on what you are accumulating.

{-# LANGUAGE DeriveFoldable #-}

import Control.Parallel.Strategies
import System.Environment
import Control.DeepSeq
import System.TimeIt
import Debug.Trace

fib 0 = 0
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

fib' n | trace msg False = undefined
  where msg = "fib called with " ++ show n
fib' n = fib n

data Tree a = Leaf a | Node (Tree a) (Tree a)
            deriving (Show, Read, Foldable)

toList :: Tree a -> [a]
toList = foldr (:) []

computeSum :: Int -> Tree Int -> Int
computeSum k t = sum $ runEval $ parBuffer k rseq $ map fib' $ toList t

main = do
  tree <- fmap read $ readFile "tree.in"
  timeIt $ print $ computeSum 4 tree
  return ()

Upvotes: 1

Daniel Wagner
Daniel Wagner

Reputation: 153102

The naive fold is written this way:

cata fLeaf fNode = go where
    go (Leaf z) = fLeaf z
    go (Node l r) = fNode (go l) (go r)

I suppose the parallel one would be pretty simply adapted:

parCata fLeaf fNode = go where
    go (Leaf z) = fLeaf z
    go (Node l r) = gol `par` gor `pseq` fNode gol gor where
        gol = go l
        gor = go r

But could even be written in terms of cata:

parCata fLeaf fNode = cata fLeaf (\l r -> l `par` r `pseq` fNode l r)

Upvotes: 2

Related Questions