Reputation: 705

Flattening a binary tree in a specific manner

Consider the following definitions of binary and unary trees, a function flatten, which converts binary and unary trees to lists (e.g, flatten (Node (Leaf 10) 11 (Leaf 20)) is [10,11,20]) and a function, reverseflatten, which converts lists to binary trees (in the specific manner described here (Defining a function from lists to binary and unary trees) and illustrated in the picture below):

data Tree a = Leaf a | Node (Tree a) a (Tree a) | UNode a (Tree a) deriving (Show)
flatten :: Tree a -> [a]
flatten (Leaf x) = [x] 
flatten (Node l x r) = flatten l ++ [x] ++ flatten r
flatten (UNode l x) = [l] ++ flatten x

reverseflatten :: [a] -> Tree a
reverseflatten [x] = (Leaf x)
reverseflatten [x,y] = UNode x (Leaf y)
reverseflatten [x,y,z] = Node (Leaf x) y (Leaf z)
reverseflatten (x:y:xs) = revflat2 (x:y:xs)

revflat2 :: [a] -> Tree a
revflat2 [x] = (Leaf x)
revflat2 [x,y] = UNode y (Leaf x)
revflat2 [x,y,z] = Node (Leaf x) y (Leaf z)
revflat2 (x:y:xs) = Node (Leaf x) y (revflat2 ([head $ tail xs] ++ [head xs] ++ tail (tail xs)))

reverseflatten [1..5] is Node (Leaf 1) 2 (Node (Leaf 4) 3 (Leaf 5), but (reverseflatten(flatten(reverseflatten [1..5]))) does not return the same as reverseflatten [1..5]. How could flatten be modified so that reverseflatten x: xs is the same as (reverseflatten(flatten(reverseflatten x:xs)))?

reverseflatten forms the series of trees in the picture below. For example, reverseflatten [x,y,z] forms Tree 2 in the picture, reverseflatten [x,y,z, x'] forms Tree 3, reverseflatten [x,y,z, x', y'] forms Tree 4, reverseflatten [x,y,z, x', y', z'] forms Tree 5, reverseflatten [x,y,z, x', y', z', x''] forms Tree 6, etcetera. Series of trees generated by <code>reverseflatten</code>

What I want is that reverseflatten x: xs is the same as (reverseflatten(flatten(reverseflatten x:xs))). So I need to design flatten so it has this effect.

I have made the following attempt (where the case flatten Node l x r is supposed to divide into a case in which r is a leaf, and a case where it is not):

flatten :: Tree a -> [a]
flatten (Leaf x) = [x] 
flatten (UNode l x) = [l] ++ flatten x 
flatten (Node l x r)
    | r == Leaf y   = [l, x, r]  
    | otherwise = flatten (Node l x (revflat2 ([head $ tail r] ++ [head r]     ++ tail (tail r)))

but this produces:

experiment.hs:585:1: error:
    parse error (possibly incorrect indentation or mismatched brackets)
    |
585 | flatten (UNode l x) = [l] ++ flatten x 
    | ^

Upvotes: 0

Answers (3)

Daniel Wagner

Reputation: 153182

Let's hypothesize a slightly stronger property and just calculate without thinking, and see where it gets us. Namely, that stronger property will be that whenever xs is not empty, we have:

flatten (reverseflatten xs) = xs

From the definition of reverseflatten, there are four cases to consider. The first is this:

flatten (reverseflatten [x]) = [x]
flatten (Leaf x) = [x]

flatten (reverseflatten [x,y]) = [x,y]
flatten (UNode x (Leaf y)) = [x,y]

Then:

flatten (reverseflatten [x,y,z]) = [x,y,z]
flatten (Node (Leaf x) y (Leaf z)) = [x,y,z]

Finally:

flatten (reverseflatten (x:y:xs)) = x:y:xs
flatten (revflat2 (x:y:xs)) = x:y:xs

Because the previous patterns have captured the situations where xs matches [] or [_], we need only consider one case of revflat2, namely, the one where xs has at least two elements.

flatten (revflat2 (x:y:w:z:xs)) = x:y:w:z:xs
flatten (Node (Leaf x) y (revflat2 (z:w:xs))) = x:y:w:z:xs

Aha! For this to work, it would be nice to have a helper with a new property, namely:

flatten2 (revflat2 (z:w:xs)) = w:z:xs

(We'll actually use the names x and y instead of w and z, of course.) Once again let us calculate without thinking. There are three cases for xs, namely [], [_], and longer. When xs is []:

flatten2 (revflat2 [x,y]) = [y,x]
flatten2 (UNode y (Leaf x)) = [y,x]

For [_]:

flatten2 (revflat2 [x,y,z]) = [y,x,z]
flatten2 (Node (Leaf x) y (Leaf z)) = [y,x,z]

And for longer:

flatten2 (revflat2 (x:y:w:z:xs)) = y:x:w:z:xs
flatten2 (Node (Leaf x) y (revflat2 (z:w:xs))) = y:x:w:z:xs

By induction hypothesis, we have flatten2 (revflat2 (z:w:xs)) = w:z:xs, so this last equation can become:

flatten2 (Node (Leaf x) y rest) = y:x:flatten2 rest

Now we can just take all the final lines of each of these cases and they make a program:

flatten (Leaf x) = [x]
flatten (UNode x (Leaf y)) = [x,y]
flatten (Node (Leaf x) y (Leaf z)) = [x,y,z]
flatten (Node (Leaf x) y rest) = x:y:flatten2 rest

flatten2 (UNode y (Leaf x)) = [y,x]
flatten2 (Node (Leaf x) y (Leaf z)) = [y,x,z]
flatten2 (Node (Leaf x) y rest) = y:x:flatten2 rest

Is this the best program? No! In particular, it's partial -- you have some free choices you can make about what flatten and flatten2 should do when the first tree argument to a Node or UNode is not a Leaf (but no matter what choice you make it will not affect the property you care about) and about what flatten2 should do with leaves. Likely if you make sane choices here, you can coalesce many of the patterns.

But what's nice about this process is that it's completely mechanical: you can take your property of interest, turn a crank, and get out a function with that property (or conflicting equations that tell you it's not possible and why). Only once you have something that works do you need to stare and think about what would make it prettier or better. Yay, equational reasoning!

Upvotes: 1

Li-yao Xia

Reputation: 33519

Testable specification

First we can implement your specification reverseflatten (flatten (reverseflatten (x : xs))) = reverseflatten (x : xs) as a QuickCheck property.

We parameterize it by flatten and reverseflatten so it is easy to plug in different implementations.
We specialize the element type to Int because we have to tell QuickCheck what to generate at some point.
The type variable a really means Tree Int, but the generality will be useful later.

import Test.QuickCheck

prop_flat :: (Eq a, Show a) =>
             (a -> [Int]) -> ([Int] -> a) -> (Int, [Int]) -> Property
prop_flat f rf (x0, xs0) =
    (rf . f . rf) xs === rf xs
  where
    xs = x0 : xs0

-- Also remember to derive both Show and Eq on Tree.

We can check that it's a nontrivial property by applying it to the incorrect implementation.

ghci> quickCheck $ prop_flat flatten reverseflatten
*** Failed! Falsifiable (after 5 tests and 8 shrinks):    
(0,[0,0,1,0])
Node (Leaf 0) 0 (Node (Leaf 0) 1 (Leaf 0)) /= Node (Leaf 0) 0 (Node (Leaf 1) 0 (Leaf 0))

Flatten, first take

Now the implementation of flatten needs to be split in two stages, like reverseflatten, because the root behaves differently from the other nodes:

at the root, Node (Leaf x) y (Leaf z) → [x, y, z],
but in the inner nodes, Node (Leaf x) y (Leaf z) → [y, x, z]

Also note that all the trees you've shown, and those that can actually be generated by reverseflatten lean to the right, so we really only know what to do on patterns Leaf x, UNode x (Leaf y) and Node (Leaf x) y r, but not other patterns like UNode x (Node ...) or Node (Node ...) y r. Hence, considering the whole domain of Trees, flatten1 is highly partial:

flatten1 :: Tree a -> [a]
flatten1 (Leaf x) = [x]
flatten1 (UNode x (Leaf y)) = [x, y]
flatten1 (Node (Leaf x) y r) = x : y : flatten1' r

flatten1' :: Tree a -> [a]
flatten1' (Leaf x) = [x]
flatten1' (UNode x (Leaf y)) = [x, y]
flatten1' (Node (Leaf y) x r) = x : y : flatten1' r

Partiality notwithstanding, QuickCheck agrees:

ghci> quickCheck $ prop_flat flatten1 reverseflatten
+++ OK, passed 100 tests.

Flatten, total version

A total function can be obtained by generalizing the patterns a bit, but as the test above shows, the specification does not cover these extra cases. Whenever we pattern match on a nested Leaf y, we instead just get the whole tree ys and flatten it. If it does turn out to be ys = Leaf y, then it will be flattened to a singleton list, so the original semantics are preserved.

flatten2 :: Tree a -> [a]
flatten2 (Leaf x) = [x]
flatten2 (UNode x ys) = x : flatten2 ys
flatten2 (Node xs y r) = flatten2 xs ++ y : flatten2' r

flatten2' :: Tree a -> [a]
flatten2' (Leaf x) = [x]
flatten2' (UNode x ys) = x : flatten2' ys
flatten2' (Node ys x r) = x : flatten2' ys ++ flatten2' r

Flatten, totally specified version

Rather than arbitrarily generalizing the function on the unspecified part of its domain, we can also restrict its domain to match exactly the specification. This leads to an alternative type definition: in all examples, UNode only has a leaf subtree, and similarly Node has only a leaf as the left subtree, so we unpack those leaves into the constructors.

data Tree' a = Leaf' a | UNode' a a | Node' a a (Tree' a)
  deriving (Eq, Show)

The implementation of flatten' is a straightforward adaptation of flatten1:

flatten' :: Tree' a -> [a]
flatten' (Leaf' x) = [x]
flatten' (UNode' x y) = [x, y]
flatten' (Node' x y r) = x : y : f'' r

f'' :: Tree' a -> [a]
f'' (Leaf' x) = [x]
f'' (UNode' x y) = [x, y]
f'' (Node' x y r) = y : x : f'' r

reverseflatten' is similarly adapted from a refactored version of reverseflatten.

reverseflatten' :: [a] -> Tree' a
reverseflatten' (x : []) = Leaf' x
reverseflatten' (x : y : []) = UNode' x y
reverseflatten' (x : y : z : r) = Node' x y (rf'' z r)

rf'' :: a -> [a] -> Tree' a
rf'' x [] = Leaf' x
rf'' x (y : []) = UNode' x y
rf'' x (y : z : r) = Node' y x (rf'' z r)

QuickCheck validates:

ghci> quickCheck $ prop_flat flatten' reverseflatten'
+++ OK, passed 100 tests.

Upvotes: 1

Lorenzo

Reputation: 2210

I think that your problem is that the first node of the tree does not have the same pattern as the others, as in if you look at Tree1 it goes [x,y,z] , whereas Tree4 goes [x,y,[x',z,y']].

You can see that the ordering of the child nodes do not follow that of the first one, which is why some people noted it feels un-natural. To fix it you can either change your definition of reverseFlattening to one that has a constant pattern, which I assume you don't want, or change your flatten to take this weird pattern into account:

data Tree a = Leaf a | Node (Tree a) a (Tree a) | UNode a (Tree a) deriving (Show)

reverseFlatten :: [a] -> Tree a
reverseFlatten [x] = (Leaf x)
reverseFlatten [x,y] = UNode y (Leaf x)
reverseFlatten [x,y,z] = Node (Leaf x) y (Leaf z)
reverseFlatten (x:y:xs) = Node (Leaf x) y (reverseFlatten ((xs !! 1) : (head xs) : (drop 2 xs)))

flatten :: Tree a -> [a]
flatten (Leaf x)            = [x]
flatten (UNode l (Leaf x))  = [l,x]
flatten (Node (Leaf l) x r) = l : x : flattenRest r

flattenRest :: Tree a -> [a]
flattenRest (Leaf x)            = [x]
flattenRest (UNode l (Leaf x))  = [l,x]
flattenRest (Node (Leaf l) x r) = x : l : flattenRest r

Note that I extended the pattern matching for your UNode and the left Node as you know already it will be a left-sided tree so there's no need to call your function if you know already what the result will be.

Upvotes: 3