Reputation: 169
Since the scala RedBlack tree is no longer available in the scala collections, I'm having trouble figuring out how to maintain properties about tree in the nodes on a scala tree, as I need direct access to the tree rather than the encaspulated interface provided by the standard collections.
For example, I want a tree that maintains on each node a field x such that the function f holds for each node n such that
f(null) = 0
f(n) = n.x + f(n.left) + f(n.right)
Can I efficiently use a Scala standard library TreeMap for this or will I have to implement my own tree (currently doing this)?
Upvotes: 1
Views: 78
Reputation: 12565
Unfortunately, I don't think you will be able to do this using TreeMap. You might want to just copy the red black tree from scala collections and add a summary field.
Alternatively, you might want to use one of the tree-based collections that allow specifying a "measure". E.g. https://github.com/Sciss/FingerTree
Or, you could use this horrible hack to add some summary data to the nodes of e.g. a TreeSet. Note the namespace scala.collection.immutable so you can access RedBlackTree from inside the utility.
package scala.collection.immutable
trait Summary[E, S] {
def apply(v: E): S
def empty: S
def combine(a: S, b: S): S
def combine3(a: S, b: S, c: S) = combine(combine(a, b), c)
}
object TreeSetSummarizer {
private[this] val treeSetAccessor = classOf[scala.collection.immutable.TreeSet[_]].getDeclaredField("tree")
treeSetAccessor.setAccessible(true)
private def tree[K](set: TreeSet[K]): AnyRef = treeSetAccessor.get(set) match {
case t: RedBlackTree.Tree[K, Unit] ⇒ t
case _ ⇒ "null"
}
private type JFunction[T, R] = java.util.function.Function[T, R]
def apply[K, S](implicit s: Summary[K, S]): (TreeSet[K] ⇒ S) = new TreeSetSummarizer[K, S]
}
class TreeSetSummarizer[K, S](implicit summary: Summary[K, S]) extends (TreeSet[K] ⇒ S) {
import TreeSetSummarizer._
// this should be a guava cache using weak keys to prevent memory leaks
private val memo = new java.util.IdentityHashMap[AnyRef, S]()
private val f: JFunction[AnyRef, S] = new JFunction[AnyRef, S] {
def apply(t: AnyRef): S = t match {
case t: RedBlackTree.Tree[K, Unit] ⇒ summary.combine3(apply0(t.left), summary.apply(t.key), apply0(t.right))
case _ ⇒ summary.empty
}
}
private def apply0(set: AnyRef): S =
memo.computeIfAbsent(set, f)
def apply(set: TreeSet[K]): S =
apply0(tree(set))
}
Here is how you would use it
import scala.collection.immutable._
// create a large TreeSet
val largeSet = TreeSet(0 until 10000: _*)
// define your summary
implicit val s = new Summary[Int, Long] {
def empty = 0L
def apply(x: Int) = x
def combine(a: Long, b: Long) = a + b
}
// define your summarizer. You need to keep the summarizer instance for
// having a performance benefit, since it internally stores summaries for
// tree nodes in an identity hash map
val summary = TreeSetSummarizer.apply[Int, Long]
// summarize something
println(summary(largeSet))
// summarize a modified set. This is fast because the summaries for tree
// nodes are being cached.
val largeSet1 = largeSet - 5000
println(summary(largeSet1))
Note that given the reflection and hashing overhead, this is probably best used for more computationally intensive summaries than a simple sum.
Update: I have now written a small library persistentsummary to define persistent summaries for existing scala collections. This should do exactly what you want.
Upvotes: 1