Dan
Dan

Reputation: 381

I need an algorithm to walk a tree with optional and alternative nodes to calculate all possible paths

Given the following tree which is a sort of Backus-Nauf form like notation where | indicates or (so B is F or G) and [] indicate optional (so H is optional)

Def: A B C
A: D E
B: F | G
C: [H] I
D: a b
E: c d
F: e f
G: g h
H: i j
I: k l

a: a
b: b
c: c
d: d
e: e
f: f
g: g
h: h
i: i
j: j
k: k
l: l    

Which can be viewed as

              Def
    A          B          C
 D     E    F  |  G   [H]    I
a b   c d  e f   g h  i j   k l

I need to walk the tree extracting the leaf nodes and convert to the following tree which gives the possible routes

 Def
     a
         b
             c
                 d
                     e
                         f
                             i
                                 j
                                     k
                                         l
                             k
                                 l
                     g
                         h
                             i
                                 j
                                     k
                                         l
                             k
                                 l

So the possible paths are

abcdefijkl

abcdefkl

abcdghijkl

abcdghkl

I've a repo with a failing C# unit test (that sets up the tree and calls a basic recusive walker) that should hopefully clarify what I'm trying to achieve.

What I can't figure out is how to branch at optional and alternative nodes while maintaining the correct leaves to append subsequent leaves to.

Upvotes: 1

Views: 190

Answers (2)

Jim Mischel
Jim Mischel

Reputation: 134045

There is another way to build a tree from your definitions. Consider:

Def: A B C
A: D E
B: F | G
C: [H] I

Start with

A
 \
  B
   \
    C

Then replace A with D E:

D
 \
  E
   \
    B
     \
      C

Do the same thing with B (replace with F | G), and C (replace with [H] I), and you get:

      D
       \
        E     
    /       \
   F         G
  / \       / \
 I   H     I   H
      \         \
       I         I

Now, if you do a recursive depth-first traversal of that tree, you get the valid strings:

D E F I
D E F H I
D E G I
D E G H I

And you can replace D with "a b", etc. when you're outputting the strings.

I showed it step-by-step, but you can do it all in a single pass.

Upvotes: 0

jingx
jingx

Reputation: 4014

A non-recursive breadth-first search would probably look something along these lines (pseudo code). Kicked off by calling findAllLeafPaths(Def):

var allPathsFound = {}
function findAllLeafPaths(startNode) {
    var tokenSequenceQueue = {
        createTokenSequenceFrom(startNode)
    }
    while (tokenSequenceQueue.isNotEmpty()) {
        var tokenSequence = tokenSequenceQueue.removeFirst()
        var allLeaves = true
        for (every token T in tokenSequence) {
            if isLeafNode(T)
                continue
            else if T's rule is "T: Y Z" {
                allLeaves = false
                tokenSequenceQueue.append(tokenSequence.replace(T, Y + Z))
            } else if T's rule is "T: [Y] Z" {
                allLeaves = false
                tokenSequenceQueue.append(tokenSequence.replace(T, Y + Z))
                tokenSequenceQueue.append(tokenSequence.replace(T, Z))
            } else if T's rule "T: Y | Z" {
                allLeaves = false
                tokenSequenceQueue.append(tokenSequence.replace(T, Y))
                tokenSequenceQueue.append(tokenSequence.replace(T, Z))
            }
        }
        if (allLeaves) {
            allPathsFound.add(tokenSequence)
        }
    }
}

Here is also a recursive depth-first version. I prefer the first one because the recursion puts your stack at the mercy of the max possible length of result paths:

var allPathsFound = {}
function toLeafNodes(tokenSequence) {
    var allLeaves = true
    for every token T in tokenSequence {
        if isLeafNode(T)
            continue
        else if T's rule is "T: Y Z" {
            allLeaves = false
            toLeafNodes(tokenSequence.replace(T, Y + Z)
        } else if T's rule is "T: [Y] Z" {
            allLeaves = false
            toLeafNode(tokenSequence.replace(T, Y + Z)
            toLeafNode(tokenSequence.replace(T, Z)
        } else if T's rule "T: Y | Z" {
            allLeaves = false
            toLeafNode(tokenSequence.replace(T, Y)
            toLeafNode(tokenSequence.replace(T, Z)
        }
    }
    if (allLeaves) {
        allPathsFound.add(tokenSequence)
    }
}

[Edit] The non-recursive version currently does one replace at a time and immediately puts the result sequence on the queue. It can be further optimized to do all possible replaces in one pass.

Upvotes: 1

Related Questions