Reputation: 305
Let's say we have a simple tree creating algorithm in Haskell:
data Tree a = EmptyTree | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)
makeTree :: Tree Int -> Tree Int
makeTree (Node 0 l r) = Node 0 EmptyTree EmptyTree
makeTree (Node n l r) = Node n (makeTree $ newTree (n - 1))
(makeTree $ newTree (n - 1))
where
newTree n = Node n EmptyTree EmptyTree
For extremely large numbers, we would expect this algorithm to fail with a "stack size overflow" error. This would be because the algorithm is binary recursive, not tail recursive. Can I use bang patterns (on the resulting left subtree "!(makeTree $ newTree (n - 1))") to guide the binary recursion into tail recursion, as the recursion should now be directed due to strictness?
EDIT:
It turns out that the real issue is not the creation of the tree, but the functions that consume the tree. There is another function used to flatten the tree, where the instance is as follows:
import qualified Data.Foldable as F
instance F.Foldable Tree where
foldMap f EmptyTree = mempty
foldMap f (Node x l r) = F.foldMap f l `mappend`
f x `mappend`
F.foldMap f r
flatten = F.foldMap (:[])
The flattening of the tree is thus invoked and it is probably here where the overflow occurs. If so, is the solution as simple as hypothetically the conversion of foldl to foldl'? Or is the binary folding going to add additional problems?
Upvotes: 1
Views: 139
Reputation: 63359
I assume that you meant to create a very deep tree, like
data Tree a = EmptyTree | Node a (Tree a) (Tree a) deriving (Show, Read, Eq)
makeTree :: Int -> Tree Int
makeTree 0 = EmptyTree
makeTree n = Node n (makeTree (n - 1)) (makeTree (n - 1))
The key is that Haskell is lazy. So after calling the function, nothing is actually created, except a thunk that is evaluated when needed. Nothing is allocated on stack, because the call to makeTree
doesn't involve a recursive call. After the root node is examined, the recursive calls are invoked, but again only one level. This means that examining each node costs only some finite time and memory (in this case constant), not depending on the depth of the tree. The important property is that every recursive call is inside a constructor. This is sometimes called corecursion or guarded recursion.
It's possible a stack overflow will occur in a function that consumes the tree, but this is a different matter.
Upvotes: 6