Kevin Meredith
Kevin Meredith

Reputation: 41939

Testing Recursive Data Structure

ScalaCheck: The Definitive Guide explains how to create generators for recursive data structures.

First, it defines the data structure:

trait Tree[T] {
    def size: Int
}
case class Leaf[T](item: T) extends Tree[T] {
    def size = 1
}
case class Node[T] (children: List[Tree[T]]) extends Tree[T] {
    def size = children.map(_.size).sum
}

Next, it shows the Gen[Tree[A]] code:

import org.scalacheck.Gen
import org.scalacheck.Gen.{oneOf, listOf, lzy}

def genTree[T](genT: Gen[T]): Gen[Tree[T]] = lzy {
    oneOf(genLeaf(genT), genNode(genT))
}

def genLeaf[T](genT: Gen[T]): Gen[Leaf[T]] =
    genT.map(Leaf(_))

def genNode[T](genT: Gen[T]): Gen[Node[T]] = for {
    children <listOf(
    genTree(genT))
} yield Node(children)

For the above generator, the book demonstrates, calling it can result in a StackOverflowError:

scala> genIntTree.sample
res0: Option[Tree[Int]] = Some(Leaf(2147483648))

scala> genIntTree.sample
res1: Option[Tree[Int]] = Some(Leaf(0))

scala> genIntTree.sample
java.lang.StackOverflowError
     at org.scalacheck.Gen$$anonfun$1$$anonfun$apply...

Given the following MyList data structure:

sealed abstract class MyList[+A]
case class Cons[+A](elem: A, rest: MyList[A]) extends MyList[A]
case object Empty                             extends MyList[Nothing]

And the following generator:

def genList[A](gen: Gen[A]): Gen[MyList[A]] =
    lzy { oneOf(genCons(gen), Gen.const(Empty)) } 

def genCons[A](gen: Gen[A]): Gen[MyList[A]] = for {
    list <- genList(gen)
    a    <- gen
} yield Cons(a, list)

My understanding is that Gen[Tree[A]]'s usage of listOf is responsible for the StackOverflowError.

However, is a StackOverflowError possible in the generator for Gen[MyList[A]] code?

I'm guessing that it is if enough genList returns enough Cons's, but I'm not sure.

Upvotes: 3

Views: 573

Answers (2)

johanneslink
johanneslink

Reputation: 5361

In your list example the probability of a stack overflow is very low — if existing at all. The reason — and the difference to the tree example — is that you cons only one element at a time.

Let’s say your stack would blow up after 1000 elements, the probability that this will occur is approx 1/(2^1000) which is a VERY small number.

Upvotes: 0

Bob Dalgleish
Bob Dalgleish

Reputation: 8227

Because the generator is recursive, it may well cause a stack overflow error. The issue is really that oneOf() is random in its selection of a path to explore; your random number generator drives the tree expansion.

I found that I could use a weighting to get trees of the depth I wanted. I believe that I played with frequency() to get the right weights to work.

Upvotes: 2

Related Questions