Snoqual
Snoqual

Reputation: 563

Is (reverse . f . reverse) efficient?

Many times I see functions which operate on the head of a list, e.g:

trimHead ('\n':xs) = xs
trimHead xs        = xs

then I see the the definition:

trimTail = reverse . trimHead . reverse

then I see:

trimBoth = trimHead . trimTail

They are clean, but are trimTail and trimBoth efficient? Is there a better way?

Upvotes: 14

Views: 855

Answers (4)

fuz
fuz

Reputation: 93172

It isn't efficient in the sense, that streaming is impossible, because the whole list needs to be evaluated to get even a single element. But a better solution is difficult, as you need to evaluate the rest of the list to know, whether a line-break is to be trimmed or not. A slightly more efficient way would be to look ahead whether the linebreak is to be trimmed and react appropriately:

trimTail, trimHead, trimBoth :: String -> String
trimTail ('\n':xs) | all (=='\n') xs = ""
trimTail (x:xs)                      = x : trimTail xs

trimHead = dropWhile (=='\n')

trimBoth = trimTail . trimHead

The solution above evaluates only as much as needed from the string to know, if the linebreak is to be trimmed. An even better method would be to incorporate the knowledge, that the next n chars are not to be trimmed. Implementing this is left as an exercise to the reader.

An even better (and shorter) way to write trimTail is this way (by rotsor):

trimTail = foldr step [] where
  step '\n' [] = []
  step x xs = x:xs

Generally, try to avoid reverse. Usually there is a better way to solve the problem.

Upvotes: 5

Dan Burton
Dan Burton

Reputation: 53725

Are trimHead and trimTail efficient?

They both take O(n) time (time directly proportional to the size of the list) since the entire list must be traversed twice in order to perform the two reverses.

Is there a better way?

Well, do you have to use lists? With Data.Sequence you can modify either end of the list in constant time. If you're stuck with lists, then check out the other solutions suggested here. If you can use Sequences instead, then just modify FUZxxl's answer to use dropWhileR.

Upvotes: 2

hammar
hammar

Reputation: 139930

Consider this alternative implementation

trimTail2 [] = []
trimTail2 ['\n'] = []
trimTail2 (x:xs) = x : trimTail2 xs

trimBoth2 = trimHead . trimTail2

It's easy to confirm that trimTail and trimBoth require that the entire list be evaluated, while trimTail2 and trimBoth2 only evaluate as much of the list as is necessary.

*Main> head $ trimTail ('h':undefined)
*** Exception: Prelude.undefined
*Main> head $ trimBoth ('h':undefined)
*** Exception: Prelude.undefined
*Main> head $ trimTail2 ('h':undefined)
'h'
*Main> head $ trimBoth2 ('h':undefined)
'h'

This implies that your version is going to be less efficient if the whole result is not needed.

Upvotes: 13

wnoise
wnoise

Reputation: 9942

Assuming the whole list is to be evaluated (if you don't need the whole list, why are you trimming the end?), it's about half as efficient as you can get out of immutable lists, but it has the same asymptotic complexity O(n).

The new list requires at least:

  1. You have to find the end: n pointer traversals.
  2. You have to modify the end, and thus what points to the end, etc.: n cons of existing data with new pointers.

reverse . trimHead . reverse performs roughly twice this:

  1. The first reverse performs n pointer traversals and n cons.
  2. trimHead possibly performs 1 pointer traversal.
  3. The second reverse performs n pointer traversals and n cons.

Is this worth worrying about? In some circumstances, maybe. Is the code too slow, and is this called a lot? In others, maybe not. Benchmark! The implementation with reverse is nice and easy to understand, and that's important.

There is a fairly natural recursive step-through-the-list solution, which will only evaluate as much of the output as is consumed, so in the case that you don't know whether you need the whole string, you can possibly save some evaluation.

Upvotes: 5

Related Questions