Clinton
Clinton

Reputation: 23135

Non-tree data structures in Haskell

Making tree like data structures is relatively easy in Haskell. However, what if I want a structure like the following:

    A (root)
   / \
  B   C
 / \ / \
D   E   F

So if I traverse down the structure through B to update E, the returned new updated structure also has E updated if I traverse through C.

Could someone give me some hints about how to achieve this? You can assume there are no loops.

Upvotes: 2

Views: 179

Answers (2)

icktoofay
icktoofay

Reputation: 128993

Once you've established identity, it can be done.

But first you must establish identity.

In many languages, values can be distinct from each other, but equal. In Python, for example:

>>> a = [1]
>>> b = [1]
>>> a == b
True
>>> a is b
False

You want to update E in one branch of the tree, and also update all other elements for which that element is E. But Haskell is referentially transparent: it has no notion of things being the same object; only equality, and even that is not applicable for every object.

One way you could do this is equality. Say this was your tree:

    __A__
   /     \
  B       C
 / \     / \
1   2   2   3

Then we could go through the tree and update all the 2s to, say, four. But this isn't exactly what you want in some cases.

In Haskell, if you want to update one thing in multiple places, you'll have to be explicit about what is and isn't the same thing. Another way you could deal with this is to tag each different value with a unique integer, and use that integer to determine identity:

              ____________A___________
             /                        \
            B                          C
           / \                        / \
(id=1)"foo"   (id=2)"bar"  (id=2)"bar"   (id=3)"baz"

Then we could update all values with an identity of 2. Accidental collisions cannot be a problem, as there can be no collisions except those that are intentional.

This is essentially what STRef and IORef do, except they hoist the actual value into the monad's state and hide the identities from you. The only downside of using these is you'll need to make much of your code monadic, but you're probably not going to get away from that easily whatever you do. (Modifying values rather than replacing them is an inherently effectful thing to do.)

The structure you gave was not specified in much detail so it's impossible to tailor an example to your use case, but here's a simple example using the ST monad and a Tree:

import Control.Monad
import Control.Monad.ST
import Data.Tree
import Data.Traversable (traverse)
import Data.STRef

createInitialTree :: ST s (Tree (STRef s String))
createInitialTree = do
  [a, b, c, d, e, f] <- mapM newSTRef ["A", "B", "C", "D", "E", "F"]
  return $ Node a [ Node b [Node d [], Node e []]
                  , Node c [Node e [], Node f []]
                  ]

dereferenceTree :: Tree (STRef s a) -> ST s (Tree a)
dereferenceTree = traverse readSTRef

test :: ST s (Tree String, Tree String)
test = do
  tree <- createInitialTree
  before <- dereferenceTree tree
  let leftE = subForest (subForest tree !! 0) !! 1
  writeSTRef (rootLabel leftE) "new"  -- look ma, single update!
  after <- dereferenceTree tree
  return (before, after)

main = do
  let (before, after) = runST test
  putStrLn $ drawTree before
  putStrLn $ drawTree after

Observe that although we only explicitly modified the value of the left E value, it changed on the right side, too, as desired.


I should note that these are not the only ways. There are probably many other solutions to this same problem, but they all require you to define identity sensibly. Only once that has been done can one begin the next step.

Upvotes: 3

Zeta
Zeta

Reputation: 105876

I would flatten the data structure to an array, and operate on this instead:

import Data.Array

type Tree = Array Int -- Bounds should start at (1) and go to sum [1..n]
data TreeTraverse = TLeft TreeTraverse | TRight TreeTraverse | TStop

Given some traverse directions (left, right, stop), it's easy to see that if we go left, we simply add the current level to our position, and if we go right, we also add the current position plus one:

getPosition :: TreeTraverse -> Int
getPosition = getPosition' 1 1
  where 
    getPosition' level pos (TLeft ts)  = getPosition' (level+1) (pos+level) ts
    getPosition' level pos (TRight ts) = getPosition' (level+1) (pos+level + 1) ts
    getPosition' _     pos (TStop)     = pos

In your case, you want to traverse either ABE or ACE:

traverseABE = TLeft $ TRight TStop
traverseACE = TRight $ TLeft TStop

Since we already now how to get the position of your element, and Data.Array provides some functions to set/get specific elements, we can use the following functions to get/set tree values:

getElem :: TreeTraverse -> Tree a -> a
getElem tt t = t ! getPosition tt

setElem :: TreeTraverse -> Tree a -> a -> Tree a
setElem tt t x = t // [(getPosition tt, x)]

To complete the code, lets use your example:

example = "ABCDEF"

exampleTree :: Tree Char
exampleTree = listArray (1, length example) example

And put everything to action:

main :: IO ()
main = do
  putStrLn $ "Traversing from A -> B -> E: " ++ [getElem traverseABE exampleTree]
  putStrLn $ "Traversing from A -> C -> E: " ++ [getElem traverseACE exampleTree]
  putStrLn $ "exampleTree: " ++ show exampleTree ++ "\n"

  putStrLn $ "Setting element from A -> B -> E to 'X', "
  let newTree = setElem traverseABE exampleTree 'X'
  putStrLn $ "but show via A -> C -> E: " ++ [getElem traverseACE newTree]
  putStrLn $ "newTree: " ++ show newTree ++ "\n"

Note that this is most-likely not the best way to do this, but the first thing that I had in mind.

Upvotes: 3

Related Questions