Shoe
Shoe

Reputation: 76240

Keep track of progress of a `map`

I have a map operation (that is actually run in parallel using parMap from Control.Parallel.Strategies) that takes quite a while. Given that I know how many times the function is applied (n in this context), how can I easily display, every once in a while, how many of the n applications have been evaluated?

The obvious solution would be to make the map a mapM with some putStr inside the mapping function, but that would:

So, is there a way to keep track of this information, that I'm missing, that avoids these problems?

Upvotes: 11

Views: 278

Answers (2)

j.p.
j.p.

Reputation: 1041

In production you probably shouldn't use trace and are forced to deal with the complications of needing IO, but for tests you could modify the definition of parMap to take another parameter telling when to emit a count:

import Control.Monad (sequence)
import Control.Parallel.Strategies (Strategy, using, rseq, rparWith, parMap)
import Debug.Trace (traceShow)
import System.IO (hFlush, hSetBuffering, BufferMode(NoBuffering), stdout)

evalList' :: Integer -> Strategy a -> Strategy [a]
evalList' t s as = sequence $ foldr f [] $ zip as [1..]
  where f (a, n) ss | n `mod` t == 0 = s (traceShow n a):ss
                    | otherwise      = s a:ss

parList' :: Integer -> Strategy a -> Strategy [a]
parList' t s = evalList' t (rparWith s)

parMap' :: Integer -> Strategy b -> (a -> b) -> [a] -> [b]
parMap' t s f xs = map f xs `using` parList' t s

-- some work to do
fib :: Integer -> Integer
fib 0 = 1
fib 1 = 1
fib n = fib (n-1) + fib(n-2)

main = do hSetBuffering stdout NoBuffering
          print $ sum (parMap' 1000 rseq (fib.(+20).(`mod` 5)) ([1..10000]::[Integer]))

If the work packages given by each list element become to small, you could adapt parListChunk instead accordingly.

Upvotes: 2

chi
chi

Reputation: 116139

One could try to craft this behaviour using timeout.

seconds :: Int
seconds = 1000000

progress :: [a] -> IO ()
progress [] = return ()
progress l@(x:xs) =
  do r <- timeout (5 * seconds) x  -- 5s
     threadDelay (2 * seconds)     -- 2s more delay
     case r of
       Nothing -> progress l  -- retry
       Just y  -> do putStrLn "one done!"
                     progress xs

Be careful since I fear that timeout is aborting the computation. If there's another thread that evaluates x that should be fine, but if this is the only thread doing that it could cause a livelock if 5 seconds are not enough.

Upvotes: 0

Related Questions