Michael
Michael

Reputation: 42050

How to optimize building a tree from a list of the node paths?

Suppose I am writing a function fromPaths(paths: List[String]): Node to build a tree from a few node paths like this:

case class Node(value: String, children: List[Node])
val paths = List("a/b/x", "a/b/y", "a/c", "a/c/d")
fromPaths(paths) // Node("a", List(Node("b", List(Node("x"), Node("y"))), Node("c", List(Node("d")))))

I can write a function addNode(root: Node, path: String): Node and then just fold it over the list. However it looks suboptimal since we traverse the tree from root to node for each "path" in paths.

How would you optimize fromPaths for the overall number of traversed nodes ?

Upvotes: 0

Views: 154

Answers (1)

Andrey Tyukin
Andrey Tyukin

Reputation: 44918

The obvious recursion seems to work just fine, at least on your examples:

case class Node[A](a: A, children: List[Node[A]])
def trieForest[A](lists: List[List[A]]): List[Node[A]] = {
  lists.filter(_.nonEmpty).groupBy(_.head).map { 
    case (k, vs) =>
    Node(k, trieForest(vs.map(_.tail)))
  }.toList
}

def fromPaths(paths: List[String]): Node[String] = {
  // assumes that there is only one tree in the 
  // top-level forest.
  trieForest(paths.map(_.split("/").toList)).head
}

println(fromPaths(List("a/b/x", "a/b/y", "a/c", "a/c/d")))

Prints (up to indentation):

Node(a, List(
  Node(b,List(
    Node(y,List()), 
    Node(x,List())
  )),
  Node(c,List(
    Node(d,List())
  ))
))

It can't run much faster asymptotically, because you have to look at every part of the input at least once.

Upvotes: 2

Related Questions