Reputation: 61
I've been learning Haskell, and I made a few statistical functions that work on lists of either Int or Float types.
Here are two examples:
mean :: (Real a, Floating b) => [a] -> b
mean xs = realToFrac (sum xs) / genericLength xs
sumOfSquares :: (Real a, Floating b) => [a] -> b
sumOfSquares xs = sum [(mean xs - realToFrac x) ^^ 2 | x <- xs]
However, I am now wondering if I could improve the runtime by writing sumOfSquares differently. Like this:
sumOfSquares xs = sum [(m - realToFrac x) ^^ 2 | x <- xs] where m = mean xs
or like this:
sumOfSquares xs = sum [(m - realToFrac x) ^^ 2 | x <- xs, let m = mean xs]
I don't know how Haskell handles this behind the scenes. In the original version, does the computer calculate mean xs for each entry in xs? I'm assuming Haskell doesn't optimize comprehensions in the background. Am I wrong?
Upvotes: 2
Views: 308
Reputation: 70367
The desugaring does produce different code. There are three implementations you're considering here. I'll show what each of them would look like without the list comprehension.
-- With no let / where
sumOfSquares xs = sum [(mean xs - realToFrac x) ^^ 2 | x <- xs]
sumOfSquares xs = sum (map (\x -> (mean xs - realToFrac x) ^^ 2) xs)
-- With let inside the comprehension
sumOfSquares xs = sum [(m - realToFrac x) ^^ 2 | x <- xs, let m = mean xs]
sumOfSquares xs = sum (map (\x -> let m = mean xs in (m - realToFrac x) ^^ 2) xs)
-- With where outside the comprehension
sumOfSquares xs = sum [(m - realToFrac x) ^^ 2 | x <- xs] where m = mean xs
sumOfSquares xs = sum (map (\x -> (m - realToFrac x) ^^ 2) xs) where m = mean xs
So the first two are essentially equivalent. If you have the world's simplest compiler (or you turn off optimizations), you'll get fairly inefficient code that recomputes the mean on every iteration. The third avoids this completely, by binding the computation in a where clause outside the lambda. That doesn't guarantee the value won't be recomputed at every step (Haskell is free to compute things however it chooses, so long as the result behaves equivalently to lazy evaluation), but it does improve the odds if your compiler is not very smart.
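To make this concrete, here is a self-contained sketch (the function names are mine) of the naive and shared versions. Both compute the same result; they differ only in how often mean xs may be evaluated when the compiler does not optimize:

```haskell
import Data.List (genericLength)

mean :: (Real a, Floating b) => [a] -> b
mean xs = realToFrac (sum xs) / genericLength xs

-- May recompute the mean once per element if nothing is optimized
sumOfSquaresNaive :: (Real a, Floating b) => [a] -> b
sumOfSquaresNaive xs = sum [(mean xs - realToFrac x) ^^ 2 | x <- xs]

-- Computes the mean once, shared across all elements via the where clause
sumOfSquaresShared :: (Real a, Floating b) => [a] -> b
sumOfSquaresShared xs = sum [(m - realToFrac x) ^^ 2 | x <- xs]
  where m = mean xs

main :: IO ()
main = do
  print (sumOfSquaresNaive  [1, 2, 3 :: Int] :: Double)  -- 2.0
  print (sumOfSquaresShared [1, 2, 3 :: Int] :: Double)  -- 2.0
```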
However, realistically, GHC (and most other compilers) will perform what's called the float-out optimization. Basically, if the compiler sees a let binding that it can move outside of a repeated computation, it is free to do so, since there are no possible side effects that would change in doing so. In general, GHC will take expressions of the form
\x -> let y = something in f x y
and convert them to
let y = something in \x -> f x y
This optimization is allowed so long as x does not appear as a free variable inside something, and most compilers are fairly smart about doing this, so you shouldn't have much to worry about.
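Here is a small, self-contained illustration (all names are hypothetical). The first two functions show the rewrite the compiler may perform; the third binding cannot be floated out, because x is free in the bound expression:

```haskell
-- The lambda rebinds `y` on every call...
before :: [Double] -> [Double]
before = map (\x -> let y = sum [1 .. 1000] in x + y)

-- ...so the binding may be floated out and computed once:
after :: [Double] -> [Double]
after = let y = sum [1 .. 1000] in map (\x -> x + y)

-- Cannot be floated out: the bound expression mentions x.
notFloatable :: [Double] -> [Double]
notFloatable = map (\x -> let y = x * 2 in x + y)

main :: IO ()
main = do
  print (before [1, 2] == after [1, 2])  -- True
  print (notFloatable [1, 2])            -- [3.0,6.0]
```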
There's an excellent StackOverflow answer about a lot of common optimizations performed by GHC, from which this answer is partially inspired. I suggest checking it out if you want to learn more about efficiency in Haskell.
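As a side note on efficiency (my own sketch, not part of the discussion above): if you want to avoid even the two traversals that sharing the mean requires, you can compute the sum of squared deviations in a single pass, using the identity sum (x - m)^2 = sum x^2 - (sum x)^2 / n. Be aware that this identity can lose floating-point precision when the values are large and close together:

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.List (foldl')

-- One strict pass, accumulating count, sum, and sum of squares.
sumOfSquaresOnePass :: Real a => [a] -> Double
sumOfSquaresOnePass xs = sq - s * s / n
  where
    (n, s, sq) = foldl' step (0, 0, 0) xs
    step (!c, !sm, !ss) x =
      let y = realToFrac x in (c + 1, sm + y, ss + y * y)

main :: IO ()
main = print (sumOfSquaresOnePass [1, 2, 3 :: Int])  -- 2.0
```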
Upvotes: 2