Time complexity mapping multiple times over a list

Question

If I for example apply map (just an example, could be filter or something else as well) on the same list 3 times, is Haskell going to pass through the list three times or only once? Example to run in ghci:

map (+1) $ map (*2) $ map (^2) [1..100]

What I'm wondering is if the complexity is 3n where n is the size of the list as it would be in an imperative language, or if the laziness in Haskell optimizes it to 1n by only going through the list once, and doing the three operations on each element at the same time. So with 1 it would

raise it to 2
multiply it by 2
add 1

and then move on to the next element, instead of first going to through the whole list while raising every element to 2, then going through the list again and multiplying every element by and then go through the list a third time to increment every element by one.

So which is, it? Is Haskell going through the list 3 times or only once?

jpmarinier · Accepted Answer

You can do a simple check under ghci: set statistics mode on, and try running the expression both ways but with a significant size. Sum the list and print its sum to force full evaluation.

Like this:

$ ghci
 GHCi, version 8.8.4: https://www.haskell.org/ghc/  :? for help
 λ> 
 λ> :set +s
 λ> 
 λ> n = 1000000
 (0.00 secs, 24,296 bytes)
 λ> 
 λ> sum ( map (+1) $ map (*2) $ map (^2) [1..n] )
 666667666668000000
 (1.36 secs, 865,692,136 bytes)
 λ> 
 λ> sum ( map  ((+1) . (*2) . (^2))  [1..n] )
 666667666668000000
 (1.03 secs, 753,692,056 bytes)
 λ>

So there is some performance gain in using the optimized expression, but it is much less than 3X.

Addendum:

Of course, for the sake of completeness, we also have to check with GHC code compiled with -O2, using this syntax on the command line to get the statistics:

$ myHaskellExe +RTS -s -RTS
...

The performance is much better than with ghci, so let's time for 100 millions rather than just one million.

Source code #1:

sumx1 :: Integer -> Integer
sumx1 n = sum ( map (+1) $ map (*2) $ map (^2) [1..n] )

main:: IO ()
main = do
    let  k   = 100*1000*1000
         res = sumx1 k
    putStrLn $ "k=" ++ (show k) ++ "  res1=" ++ (show res)

Result #1:

k=100000000  res1=666666676666666800000000
  12,679,831,656 bytes allocated in the heap
...
  GC      time    0.029s  (  0.028s elapsed)
  Total   time    5.457s  (  5.467s elapsed)

Source code #2:

sumx2 :: Integer -> Integer
sumx2 n = sum ( map  ((+1) . (*2) . (^2))  [1..n] )

main:: IO ()
main = do
    let  k   = 100*1000*1000
         res = sumx2 k
    putStrLn $ "k=" ++ (show k) ++ "  res2=" ++ (show res)

Result #2:

k=100000000  res2=666666676666666800000000
  12,679,831,656 bytes allocated in the heap
...
  GC      time    0.030s  (  0.029s elapsed)
  Total   time    5.500s  (  5.512s elapsed)

Hence, with GHC the performance difference between the two expressions is insignificant. High-level optimizer at work probably.

Time complexity mapping multiple times over a list

Answers (1)

Addendum:

Related Questions