Danny Navarro
Danny Navarro

Reputation: 2753

Finding a linear path in a traversable structure

Given a list of steps:

>>> let path = ["item1", "item2", "item3", "item4", "item5"]

And a labeled Tree:

>>> import Data.Tree
>>> let tree = Node "item1" [Node "itemA" [], Node "item2" [Node "item3" []]]

I'd like a function that goes through the steps in path matching the labels in tree until it can't go any further because there are no more labels matching the steps. Concretely, here it falls when stepping into "item4" (for my use case I still need to specify the last matched step):

>>> trav path tree
["item3", "item4", "item5"]

If I allow [String] -> Tree String -> [String] as the type of trav I could write a recursive function that steps in both structures at the same time until there are no labels to match the step. But I was wondering if a more general type could be used, specifically for Tree. For example: Foldable t => [String] -> t String -> [String]. If this is possible, how trav could be implemented?

I suspect there could be a way to do it using lens.

Upvotes: 1

Views: 114

Answers (2)

J. Abrahamson
J. Abrahamson

Reputation: 74354

We'll take as reference the following recursive model

import           Data.List   (minimumBy)
import           Data.Ord    (comparing)
import           Data.Tree

-- | Follows a path into a 'Tree' returning steps in the path which
-- are not contained in the 'Tree'
treeTail :: Eq a => [a] -> Tree a -> [a]
treeTail [] _ = []
treeTail (a:as) (Node a' trees)
  | a == a'   = minimumBy (comparing length) 
              $ (a:as) : map (treeTail as) trees
  | otherwise = as

which suggests that the mechanism here is less that we're traversing through the tree accumulating (which is what a Traversable instance might do) but more that we're stepping through the tree according to some state and searching for the deepest path.

We can characterize this "step" by a Prism if we like.

import Control.Lens

step :: Eq a => a -> Prism' (Tree a) (Forest a)
step a = 
  prism' (Node a)
         (\n -> if rootLabel n == a 
                  then Just (subForest n)
                  else Nothing)

This would allow us to write the algorithm as

treeTail :: Eq a => [a] -> Tree a -> [a]
treeTail [] _ = []
treeTail pth@(a:as) t =
  maybe (a:as)
        (minimumBy (comparing length) . (pth:) . map (treeTail as))
        (t ^? step a)

but I'm not sure that's significantly more clear.

Upvotes: 2

leftaroundabout
leftaroundabout

Reputation: 120711

First, please let's use type Label = String. String is not exactly descriptive and might not be ideal in the end...

Now. To use Traversable, you need to pick a suitable Applicative that can contain the information you need for deciding what to do in its "structure". You only need to pass back information after a match has failed. That sounds like some Either!

A guess would thus be Either [Label] (t Label) as the pre-result. That would mean, we use the instantiation

    traverse :: Traversable t
      => (Label -> Either [Label] Label) -> t Label -> Either [Label] (t Label)

So what can we pass as the argument function?

travPt0 :: [Label] -> Label -> Either [Label] Label
travPt0 ls@(l0 : _) label
   | l0 /= label   = Left ls
   | otherwise     = Right label     ?

The problem is, traverse will then fail immediately and completely if any node has a non-matching label. Traversable doesn't actually have a notion of "selectively" diving down into a data structure, it just passes through everything, always. Actually, we only want to match on the topmost node at first, only that one is mandatory to match at first.

One way to circumvent immediate deep-traversal is to first split up the tree into a tree of sub-trees. Ok, so... we need to extract the topmost label. We need to split the tree in subtrees. Reminds you of anything?

trav' :: (Traversable t, Comonad t) => [Label] -> t Label -> [Label]
trav' (l0 : ls) tree
  | top <- extract tree
             = if top /= l0 then l0 : ls
               else let subtrees = duplicate tree
                    in  ... ?

Now amongst those subtrees, we're basically interested only in the one that matches. This can be determined from the result of trav': if the second element is passed right back again, we have a failure. Unlike normal nomenclature with Either, this means we wish to go on, but not use that branch! So we need to return Either [Label] ().

               else case ls of
                     []     -> [l0]
                     l1:ls' -> let subtrees = duplicate tree
                               in  case traverse (trav' ls >>> \case
                                                     (l1':_)
                                                       | l1'==l1 -> Right ()
                                                     ls''        -> Left ls''
                                                 ) subtrees of
                                      Left ls'' -> ls''
                                      Right _   -> l0 : ls   -- no matches further down.

I have not tested this code!

Upvotes: 2

Related Questions