EkcenierK

Reputation: 1439

How do alternative uses of recursion influence the efficiency of Haskell list reverse?

I am giving myself a refresher on Haskell after a long break. I am working through 99 Haskell Problems. I wrote a solution for #5, and then took a look at the alternative solutions to stretch my knowledge a bit. The following is an extract from that solutions page for #5:

Reverse a list.

The standard definition, found in the prelude, is concise, but not very readable. Another way to define reverse is:

reverse :: [a] -> [a]
reverse [] = []
reverse (x:xs) = reverse xs ++ [x]

However, this definition is more wasteful than the one in the Prelude, as it repeatedly reconses the result as it is accumulated. The following variation avoids that, and is thus computationally closer to the Prelude version.

reverse :: [a] -> [a]
reverse list = reverse' list []
  where
    reverse' [] reversed     = reversed
    reverse' (x:xs) reversed = reverse' xs (x:reversed)

Could someone explain why the second solution is more efficient than the first? I can see that both solutions use recursion, but I don't yet understand how the differences affect efficiency. Is it something to do with how Haskell traverses the lists?

Upvotes: 0

Views: 226

Answers (1)

Sebastian Redl

Reputation: 71989

The first solution uses recursion, but it isn't tail recursion: in tail recursion, the recursive call is the very last thing the function does. Tail recursion is more efficient than non-tail recursion because the function has nothing left to do after the recursive call returns, so its stack frame etc. can be reused.

The first solution makes the recursive call first, and then applies the concatenation operator to the result and a singleton list. The second solution prepends the element to the accumulator first, and then calls itself, without needing to do anything further.
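To make this concrete, here is a hand-evaluation sketch of both definitions on a three-element list (using the names from the question's snippets):

```haskell
-- Naive version: the real work (++) happens *after* each
-- recursive call returns, so calls pile up:
--   reverse [1,2,3]
-- = reverse [2,3] ++ [1]
-- = (reverse [3] ++ [2]) ++ [1]
-- = ((reverse [] ++ [3]) ++ [2]) ++ [1]
-- = (([] ++ [3]) ++ [2]) ++ [1]
-- = [3,2,1]
--
-- Accumulator version: each step is a plain tail call,
-- and the answer is ready when the input runs out:
--   reverse' [1,2,3] []
-- = reverse' [2,3] [1]
-- = reverse' [3]   [2,1]
-- = reverse' []    [3,2,1]
-- = [3,2,1]
```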

On top of that, ++ takes time linear in the length of its first argument, because it has to copy every node of that list so that the copy's last node can point to the first node of the right-hand-side list.

The first variant does its actual work inside out: it calls itself recursively first, and only does its own work after that call returns, so the innermost calls finish first. In other words, it processes the list elements right-to-left. So when it encounters a new element (one that was further to the left in the original list), it has to append it to the right end of the result. In a Haskell list, adding to the right end is expensive, while adding to the left end is cheap.
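Here is a runnable sketch of both variants side by side, so you can compare them yourself. I've renamed them naiveReverse and accReverse (my names, to avoid shadowing the Prelude's reverse):

```haskell
-- O(n^2) total: each step copies the entire result so far via (++).
naiveReverse :: [a] -> [a]
naiveReverse []     = []
naiveReverse (x:xs) = naiveReverse xs ++ [x]

-- O(n) total: each step conses one element onto the accumulator.
accReverse :: [a] -> [a]
accReverse list = go list []
  where
    go []     acc = acc
    go (x:xs) acc = go xs (x:acc)

main :: IO ()
main = do
  print (naiveReverse [1..5 :: Int])  -- [5,4,3,2,1]
  print (accReverse   [1..5 :: Int])  -- [5,4,3,2,1]
```

With a list of a few hundred thousand elements, the quadratic behaviour of naiveReverse becomes very noticeable, while accReverse stays fast.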

The second solution works outside-in, or left-to-right, which is the right execution order for this problem.

Upvotes: 3
