Are there any problems with this Haskell function for strictly timing a computation?

Question

Recently I was trying to determine the time needed to calculate a waveform using the vector storage type.

I wanted to do so without having to print the length or something similar just to force evaluation. Finally I came up with the following two definitions. They seem simple enough, and from what I can tell the first call prints a non-zero computation time as expected, but I'm wondering if there are any laziness caveats here that I've missed.

{-# LANGUAGE BangPatterns #-}

import System.IO
import System.CPUTime
import qualified Data.Vector.Storable as V

timerIO f = do
  start <- getCPUTime
  x <- f
  let !y = x  -- force the result to WHNF before reading the clock again
  end <- getCPUTime
  let diff = (fromIntegral (end - start)) / (10^12)  -- picoseconds to seconds
  putStrLn $ "Computation time: " ++ show diff ++ " sec"

timer f = timerIO $ do return f

main :: IO ()
main = do
  let sr = 1000.0
      time = V.map (/ sr) $ V.enumFromN 0 120000 :: V.Vector Float
      wave = V.map (\x -> sin $ x * 2 * pi * 10) time :: V.Vector Float

  timer wave
  timer wave

prints,

Computation time: 0.16001 sec
Computation time: 0.0 sec

Are there any hidden bugs here? I'm really not sure that the let with strictness flag is really the best way to go here. Is there a more concise way to write this? Are there any standard functions that already do this that I should know about?

Edit: I should mention that I had read about criterion but in this case I was not looking for a robust way to calculate average timing for profiling-only purposes; rather I was looking for a simple / low-overhead way to integrate single timers into my program for tracing the timing of some computations during normal running of the application. Criterion is cool, but this was a slightly different use case.

Daniel Fischer · Accepted Answer

If evaluating to weak head normal form is enough (for strict Vectors or UArrays it is), then your timing code works well¹. However, instead of the bang pattern in the let-binding, you could put a bang on the monadic bind,

start <- getCPUTime
!x <- f
end <- getCPUTime

which to me looks nicer, or you could use Control.Exception.evaluate

start <- getCPUTime
x <- f >>= evaluate  -- run the action f, then force its result to WHNF
end <- getCPUTime

which has the advantage of (supposed) portability, whereas bang patterns are a GHC extension. If WHNF is not enough, you would need to force full evaluation, for example using rnf or deepseq, like

start <- getCPUTime
!x <- rnf `fmap` f
end <- getCPUTime
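
Putting the pieces above together, a minimal self-contained sketch of the evaluate-based variant might look like the following. The names timeIO and timePure are made up for illustration (they are not from the question), and the timer returns the result along with the elapsed seconds instead of printing, which is a design choice of this sketch only.

```haskell
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)
import Text.Printf (printf)

-- | Hypothetical helper: run an action, force its result to WHNF, and
-- return the result together with the CPU time spent, in seconds.
timeIO :: IO a -> IO (a, Double)
timeIO action = do
  start <- getCPUTime
  x <- action >>= evaluate   -- force the result before stopping the clock
  end <- getCPUTime
  let secs = fromIntegral (end - start) / 1e12  -- getCPUTime counts picoseconds
  return (x, secs)

-- | Hypothetical helper: time a pure value by wrapping it in 'return'.
timePure :: a -> IO (a, Double)
timePure = timeIO . return

main :: IO ()
main = do
  (total, secs) <- timePure (sum [1 .. 1000000 :: Int])
  printf "sum = %d, computation time: %.5f sec\n" total secs
```

If WHNF is not enough for the result type, one option (assuming the deepseq package is available) would be to replace evaluate with evaluate . force, where force comes from Control.DeepSeq.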

However, repeatedly timing the same computation with that is hairy. If, as in your example, you give the thing a name, and call it

timer wave
timer wave

the compiler shares the computation, so it's only done once and all but the first timer calls return zero (or very close to zero) times. If you call it with code instead of a name,

timer (V.map (\x -> sin $ x * 2 * pi * 10) time :: V.Vector Float)
timer (V.map (\x -> sin $ x * 2 * pi * 10) time :: V.Vector Float)

the compiler can still share the computation, if it does common subexpression elimination. And although GHC doesn't do much CSE, it does some and I'm rather confident it would spot and share this (when compiling with optimisations). To reliably make the compiler repeat the computations, you need to hide the fact that they are the same from it (or use some low-level internals), which is not easy to do without influencing the time needed for the computation.
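
One commonly used way to try to hide the sharing is to pass the function and its argument to the timer separately, so that the application is only built inside the timer (criterion's whnf takes a function and an argument in a similar spirit). The sketch below is an illustration under assumptions, not a guarantee: timeApp and work are hypothetical names, the NOINLINE pragma is a guess at what keeps the call sites separate, and whether the work is really repeated can still depend on the GHC version and optimisation flags.

```haskell
import Control.Exception (evaluate)
import System.CPUTime (getCPUTime)
import Text.Printf (printf)

-- | Hypothetical helper: apply 'f' to 'x' inside the timer, so that each
-- call builds a fresh unevaluated application rather than reusing a
-- named, already evaluated value.
timeApp :: (a -> b) -> a -> IO (b, Double)
timeApp f x = do
  start <- getCPUTime
  y <- evaluate (f x)        -- force the fresh application to WHNF
  end <- getCPUTime
  return (y, fromIntegral (end - start) / 1e12)
{-# NOINLINE timeApp #-}     -- assumption: discourages merging of call sites

main :: IO ()
main = do
  -- Stand-in workload; any function whose result is forced by WHNF will do.
  let work n = sum (map (\k -> k * k `mod` 7) [1 .. n]) :: Int
  (r1, t1) <- timeApp work 2000000
  (r2, t2) <- timeApp work 2000000
  printf "run 1: %d in %.5f sec\n" r1 t1
  printf "run 2: %d in %.5f sec\n" r2 t2
```

If both runs report similar non-zero times, the computation was repeated; if the second drops to zero, the compiler found a way to share it after all.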

¹ It works well if the computation takes a significant amount of time. If it takes only a short time, the jitter introduced by outside influences (CPU load, scheduling, ...) will make single timings far too unreliable. Then you should do multiple measurements, and for that, as has been mentioned elsewhere, the criterion library is an excellent way to relieve you of the burden of writing robust timing code.
