hisoka
hisoka

Reputation: 354

Extracting specific types from a tree

I have built a tree from which I want to collect all the Leaf types:

Branch [] (Branch [0] (Leaf [0,1]) (Branch [0] (Leaf [0,2]) (Branch
[0] (Leaf [0,3]) (Leaf [0])))) (Branch [] (Branch [1] (Leaf [1,2])
(Branch [1] (Leaf [1,3]) (Leaf [1]))) (Branch [] (Branch [2] (Leaf
[2,3]) (Leaf [2])) (Branch [] (Leaf [3]) (Leaf []))))

What I get as a type in GHCI (:t) of the above variable is :

Tree [Int]

The data structure is the following:

data Tree a = Empty | Leaf a | Branch a (Tree a) (Tree a)

I am trying to isolate ONLY the leaves such that I would obtain :

[ [0,1], [0,2] .. [3], [] ]

I've been trying to run a filter on the results, however that doesn't work. I've tried to using the function Data.Foldable.toList, however it pulls all the branches in as well and results in a large list of lists with multiple duplicates and no possibility to tell whether it's a branch or a leaf.

Upvotes: 6

Views: 251

Answers (3)

Alex
Alex

Reputation: 19124

As @BenjaminHodgson notes in the comments the following solution is valid, but cannot have a consistent implementation across other typeclasses. For example making Tree an instance of Traversable would result in an implementation of the traverse function that necessarily touches all elements of the Tree. I am leaving this up for learning purposes, but don't use this solution.

Using the Foldable typeclass:

import qualified Data.Foldable as F

instance F.Foldable Tree where
    foldMap f Empty = mempty
    foldMap f (Branch x l r) = foldMap f l `mappend`
                               foldMap f r
    foldMap f (Leaf x) = f x

For your tree:

Prelude> foldr (\x acc -> x: acc) [] tree
[[0,1],[0,2],[0,3],[0],[1,2],[1,3],[1],[2,3],[2],[3],[]]

Upvotes: 0

Benjamin Hodgson
Benjamin Hodgson

Reputation: 44664

A more advanced technique, which saves you the effort of writing and maintaining recursive functions by hand, is to use a generic programming library such as lens's Plated module.

Here's how Plated works: you describe how to identify a value's children - immediate substructures having the same type as the value itself - by writing an instance of the Plated class, and the library's various higher-order functions take care of recursively finding the children's children and so on. In the case of your Tree datatype, only the Branch constructor has children (the left and right children), so those are the only places we apply f.

instance Plated (Tree a) where
    plate f (Branch x l r) = Branch x <$> f l <*> f r
    plate f t = pure t

(If you're willing to derive Data then you don't even have to write plate.)

Plated's universe function can now recursively search a tree's children, and the children's children, and so on, returning a lazy list which yields every node in the tree. It works roughly like this:

universe :: Plated a => a -> [a]
universe t = t : [descendant | child <- toListOf plate t, descendant <- universe child]

So to find all the leaves, you just have to filter this list to search for Leaf constructors.

leaves :: Tree a -> [a]
leaves t = [x | Leaf x <- universe t]

Job done!

Upvotes: 1

willeM_ Van Onsem
willeM_ Van Onsem

Reputation: 477794

Although other approaches exist, probable the easiest one is to use recursion. Here the base cases are Empty and Leaf. In case of an Empty, we return an empty list, in case of a Leaf, we can return a list with one element: the one wrapped in the leaf, so:

leave_elements :: Tree a -> [a]
leave_elements Empty = []
leave_elements (Leaf x) = [x]
leave_elements (Branch ...) = ...

We still need to fill in the Branch case, here we see three elements in the constructor: a, which we can ignore, and the two subtrees. We can recursively call the leave_elements recursively on the subtrees, and append the lists of the data of the leaves of the subtrees. For example:

leave_elements :: Tree a -> [a]
leave_elements Empty = []
leave_elements (Leaf x) = [x]
leave_elements (Branch _ l r) = leave_elements l ++ leave_elements r

For your given sample tree, this produces:

Prelude> leave_elements (Branch [] (Branch [0] (Leaf [0,1]) (Branch [0] (Leaf [0,2]) (Branch [0] (Leaf [0,3]) (Leaf [0])))) (Branch [] (Branch [1] (Leaf [1,2]) (Branch [1] (Leaf [1,3]) (Leaf [1]))) (Branch [] (Branch [2] (Leaf [2,3]) (Leaf [2])) (Branch [] (Leaf [3]) (Leaf [])))))
[[0,1],[0,2],[0,3],[0],[1,2],[1,3],[1],[2,3],[2],[3],[]]

We can also boost performance by using for instance a tail we pass recursively:

leave_elements :: Tree a -> [a]
leave_elements = go []
    where go tl Empty = tl
          go tl (Leaf x) = (x:tl)
          go tl (Branch _ l r) = go (go tl r) l

Or we can work with Data.DList:

import Data.DList

leave_elements :: Tree a -> [a]
leave_elements = toList . go
    where go Empty = empty
          go (Leaf x) = singleton x
          go (Branch _ l r) = append (go l) (go r)

Upvotes: 3

Related Questions