Reputation: 23945
Please forgive me if I misused the big words in the title; I'm not too knowledgeable about them but hope they describe my problem. I wrote an elaborate scheme to try and encode strings according to these requirements. For strings of length 10^4 and higher, the code I wrote is quite slow, and I'm wondering - since it processes chunks of 200 at a time (albeit moving only one character forward sometimes to take the next chunk), could it be modified to output the result faster or in a more linear fashion (e.g., immediately output the result for each 200 characters processed). Any help with that or other noticeable optimizations would be appreciated.
Per tel's suggestion, I simplified my example:
encode xs = encode' xs [] where
encode' [] result = result
encode' (z:zs) result
| null test = encode' zs (result ++ [z])
| otherwise = encode' (drop numZsProcessed zs) (result ++ processed)
where test = ..some test
toProcess = take 200 (z:zs)
processed = ..do something complicated with toProcess
numZsProcessed = ..number of z's processed
Upvotes: 6
Views: 348
Reputation: 60513
Haskell and tail recursion don't get along as well as other functional languages and tail recursion. Let's do some manual reduction on some very simple code to see what's going on with tail recursion. Here's a tail-recursive implementation of map (1+)
.
go [] r = r
go (x:xs) r = go xs (r ++ [1+x])
Also we must keep in mind the definition of (++)
:
[] ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)
Now let's reduce go [1,2,3,4,5] []
. Keep in mind that [x,y,z]
is notation for x:(y:(z:[]))
, or for short x:y:z:[]
.
go [1,2,3,4,5] []
go [2,3,4,5] ([] ++ [2]) -- 2 here is actually the thunk 1+1, but
-- for compactness I am reducing earlier
go [3,4,5] (([] ++ [2]) ++ [3])
go [4,5] ((([] ++ [2]) ++ [3]) ++ [4])
go [5] (((([] ++ [2]) ++ [3]) ++ [4]) ++ [5])
go [] ((((([] ++ [2]) ++ [3]) ++ [4]) ++ [5]) ++ [6])
(((([] ++ [2]) ++ [3]) ++ [4]) ++ [5]) ++ [6]
((([2] ++ [3]) ++ [4]) ++ [5]) ++ [6]
(((2:([] ++ [3]) ++ [4]) ++ [5]) ++ [6]
((2:(([] ++ [3]) ++ [4]) ++ [5]) ++ [6]
(2:((([] ++ [3]) ++ [4]) ++ [5]) ++ [6]
2:(((([] ++ [3]) ++ [4]) ++ [5]) ++ [6]) -- first observable output
2:((([3] ++ [4]) ++ [5]) ++ [6])
2:((3:([] ++ [4]) ++ [5]) ++ [6])
2:(3:(([] ++ [4]) ++ [5]) ++ [6])
2:3:((([] ++ [4]) ++ [5]) ++ [6]) -- second observable output
2:3:(([4] ++ [5]) ++ [6])
2:3:(4:([] ++ [5]) ++ [6])
2:3:4:(([] ++ [5]) ++ [6]) -- third observable output
2:3:4:([5] ++ [6])
2:3:4:5:([] ++ [6]) -- fourth observable output
2:3:4:5:6:[] -- final output
See how each item in the output needs to work its way outward from a deeply nested series of parentheses? This causes it to take quadratic time in the size of the input to get all the output. You'll also see a behavior that the first few items are yielded slowly, and it gets faster and faster as you reach the end of the list. This reduction explains that.
The main performance problem here is appending the new element to the end of the list, which takes time proportional to the size of the list you are appending to. A better way is to cons on the front, which is a constant-time operation. This will cause the output to come out reversed, so you need to reverse the result.
go [] r = reverse r
go (x:xs) r = go xs ((1+x):r)
reverse xs = rev xs [] -- roughly from the report prelude
rev [] r = r
rev (x:xs) r = rev xs (x:r)
And, let's reduce:
go [1,2,3,4,5] []
go [2,3,4,5] [2]
go [3,4,5] [3,2]
go [4,5] [4,3,2]
go [5] [5,4,3,2]
go [] [6,5,4,3,2]
reverse [6,5,4,3,2]
rev [6,5,4,3,2] []
rev [5,4,3,2] [6]
rev [4,3,2] [5,6]
rev [3,2] [4,5,6]
rev [2] [3,4,5,6]
rev [] [2,3,4,5,6]
[2,3,4,5,6] -- first and all observable output!
So this is clearly less work than the first version. This is the style is used in strict languages like Scheme and ML, because it has good memory performance. However, it has some disadvantages:
reverse
, which is takes an extra O(n)
time and has nothing to do with what we are doing (what does reversing have to do with adding one to every element and preserving the order?).In a lazy language like Haskell, we can do better. Strangely, and beautifully, the way we do is by writing it even more naively.
go [] = []
go (x:xs) = (1+x):go xs
and reduce:
go [1,2,3,4,5]
2:(go [2,3,4,5]) -- first observable output
2:3:(go [3,4,5]) -- second observable output
2:3:4:(go [4,5]) -- third observable output
2:3:4:5:(go [6]) -- fourth observable output
2:3:4:5:6:(go []) -- fifth observable output
2:3:4:5:6:[] -- final output
It takes even less work, and it starts yielding output before even looking at the remainder of the list, so it has good performance in a stream computation and works on infinite inputs. And the implementation is about as simple and obvious as you could hope for.
I hope this gives you some intuition for how tail recursion works in Haskell. For your example, I suggest removing the tail recursion and rewriting in a naive style similar to our final go
, using the intuition I hope I suggested from this post to yield "as large a prefix of the input as possible, as soon as possible" (notice that returning x:xs
immediately yields x
, even if there is some more work to be done to compute xs
-- that's laziness in (non-)action).
Upvotes: 6