Reputation: 41919
ScalaCheck's Gen
API docs explain lazy val sized
:
def sized[T](f: (Int) ⇒ Gen[T]): Gen[T]
Creates a generator that can access its generation size
Looking at the following example:
import org.scalacheck.Gen.{sized, posNum}
scala> sized( s => { println(s); posNum[Int] })
res12: org.scalacheck.Gen[Int] = org.scalacheck.Gen$$anon$3@6a071cc0
scala> res12.sample
100
res13: Option[Int] = Some(12)
scala> res12.sample
100
res14: Option[Int] = Some(40)
What is the meaning of generation size
, namely 100 in the above output?
Upvotes: 5
Views: 1532
Reputation:
sized
provides access to the "size" parameter of Scalacheck. This parameter indicates how "big" the values generated by a generator should be. This parameter is useful in a couple of situations:
The companion of Gen.sized
is Gen.resize
which lets you change the size of a generator, as in Gen.resize(10, Gen.alphaNumString)
which will generate a alpha-numeric string of no more than ten characters.
Most built-in generators use sized
in some way, for instance Gen.buildableOf
(which is the underpinning of all generators for lists and containers):
[…] The size of the container is bounded by the size parameter used when generating values.
To get an idea of how Gen.size
is used take a look at the example in "Sized Generators" (Generators, Scalacheck user guide):
def matrix[T](g: Gen[T]): Gen[Seq[Seq[T]]] = Gen.sized { size =>
val side = scala.math.sqrt(size).asInstanceOf[Int]
Gen.listOfN(side, Gen.listOfN(side, g))
}
This generator uses the "size" to limit the dimensions of the matrix so that the entire matrix will never have more than entries than the "size" parameter. In other words, with a size of 100 as in your question, the generated matrix would have 10 rows and 10 columns, amounting to a total of 100 entries.
"size" is particularly useful to make sure that generators for recursive data structures terminate. Consider the following example which generates instances of a binary tree and uses size
to limit the height of each branch to make sure that the generator terminates at some point:
import org.scalacheck.Gen
import org.scalacheck.Arbitrary.arbitrary
sealed abstract class Tree
case class Node(left: Tree, right: Tree, v: Int) extends Tree
case object Leaf extends Tree
val genLeaf = Gen.const(Leaf)
val genNode = for {
v <- arbitrary[Int]
left <- Gen.sized(h => Gen.resize(h/2, genTree))
right <- Gen.sized(h => Gen.resize(h/2, genTree))
} yield Node(left, right, v)
def genTree: Gen[Tree] = Gen.sized { height =>
if (height <= 0) {
genLeaf
} else {
Gen.oneOf(genLeaf, genNode)
}
}
Note how the generator for nodes recursively generates trees, but allows them only half of the "size". The tree generator in turn will only generate leafs if its size is exhausted. Thus the "size" of the generator is an upper bound for the height of the generated tree, ensuring that the generator terminates at some point and does not generate excessively large trees.
Note that the size only sets an upper bound for the height of the tree in this example. It does not affect the balancing of the generated tree or the likeliness of generating a tree with a certain depth. These depend solely on the bias defined in genTree
.
With oneOf
each subtree has a 50% chance of being just a Leaf, ending the growth of the tree at this branch, which makes generating a complete tree that exhausts the "whole" size somewhat unlikely.
frequency
(see below) let's you encode a different bias. In the example below nodes a way more likely than leafs, so the tree generated by the generator below is more likely to grow, but it's still unlikely to be complete.
Gen.frequency
is for a different use case: You would not use it to limit the depths or size of a data structure, but to add a certain bias to a choice of generators. Take a look at the definition of Gen.option
:
def option[T](g: Gen[T]): Gen[Option[T]] =
frequency(1 -> const(None), 9 -> some(g))
This definition uses frequency
to make the less-interesting case of None
less likely than the more interesting case of Some
.
In fact, we could combine Gen.sized
and Gen.frequency
in our binary tree example above to make genTree
more likely to generate "interesting" nodes rather than "boring" leafs:
def genTree: Gen[Tree] = Gen.sized { height =>
if (height <= 0) {
genLeaf
} else {
Gen.frequency(1 -> genLeaf, 9 -> genNode)
}
}
Upvotes: 13
Reputation: 30320
Generation size is the number of results the generator will produce. The sized
method simply lets you write generators that know their own size so you can use that information as a factor in what you generate.
For example, this generator (from this resource) produces two lists of numbers where 1/3 are positive and 2/3 are negative:
import org.scalacheck.Gen
import org.scalacheck.Prop.forAll
val myGen = Gen.sized { size =>
val positiveNumbers = size / 3
val negativeNumbers = size * 2 / 3
for {
posNumList <- Gen.listOfN(positiveNumbers, Gen.posNum[Int])
negNumList <- Gen.listOfN(negativeNumbers, Gen.posNum[Int] map (n => -n))
} yield (size, posNumList, negNumList)
}
check {
forAll(myGen) {
case (genSize, posN, negN) =>
posN.length == genSize / 3 && negN.length == genSize * 2 / 3
}
}
So sort of like zipWithIndex
in Scala collections, sized
just provides you meta information to help you do what you need to do.
Upvotes: 1