Reputation: 34423
I am experiencing a boxing issue which affects negatively performance of my Scala code. I have extracted the relevant code, which still shows the issue, with some added strangeness. I have the following representation of a 2D Double array which allows me to perform transformations on it by providing my functions:
case class Container(
a: Array[Array[Double]] = Array.tabulate[Double](10000, 10000)((x,y) => x.toDouble * y)
) {
def transformXY(f: (Double, Double, Double) => Double): Container = {
Container(Array.tabulate[Double](a.length, a.length) { (x, y) =>
f(x, y, a(x)(y))
})
}
def transform(f: Double => Double): Container = {
Container(Array.tabulate[Double](a.length, a.length) { (x, y) =>
f(a(x)(y))
})
}
}
Following code reproduces the issue for me:
object Main extends App {
def now = System.currentTimeMillis()
val iters = 3
def doTransformsXY() = {
var t = Container()
for (i <- 0 until iters) {
val start = now
t = t.transformXY { (x, y, h) =>
h + math.sqrt(x * x + y * y)
}
println(s"transformXY: Duration ${now - start}")
}
}
def doTransforms() = {
var t = Container()
for (i <- 0 until iters) {
val start = now
t = t.transform { h =>
h + math.sqrt(h * h * h)
}
println(s"transform: Duration ${now - start}")
}
}
if (true) { // Shows a lot of boxing if enabled
doTransformsXY()
}
if (true) { // Shows a lot of boxing again - if enabled
doTransformsXY()
}
if (true) { // Shows java8.JFunction...apply()
doTransforms()
}
if (true) { // Shows java8.JFunction...apply() if doTransforms() is enabled
doTransformsXY()
}
}
When I run this code and sample it using Java VisualVM, I experience the following:
doTransformsXY
is running, I see a lot of time spent in scala.runtime.BoxesRunTime.boxToDouble()
doTransforms
is running, there is no more significant time spend it boxing, samples show scala.runtime.java8.JFunction2$mcDII$sp.apply()
insteaddoTransformsXY
again, there is still no significant boxing, again time grows in scala.runtime.java8.JFunction2$mcDII$sp.apply()
This is with Scala 2.12.4, Windows x64 jdk1.8.0_92
My primary question is about the boxing, which I see in my production code as well:
Double
boxing happening in Array.tabulate
? Do I need to go procedural (while loops, manual Array
creation) to avoid it?My secondary question is:
transform
variant?Upvotes: 0
Views: 201
Reputation: 7353
why is no more boxing done once I call the transform variant?
I did not reproduce that. If I carefully pause VMs and check with JProfiler, it still does a lot of boxing and allocation of Doubles. Which is what I expected, and I have an explanation for.
Looking at the Function1
and Function2
traits in standard library, we can see @specialized annotations:
trait Function1[@specialized(Int, Long, Float, Double) -T1, @specialized(Unit, Boolean, Int, Float, Long, Double) +R]
trait Function2[@specialized(Int, Long, Double) -T1, @specialized(Int, Long, Double) -T2, @specialized(Unit, Boolean, Int, Float, Long, Double) +R]
but the Function3
is just
trait Function3[-T1, -T2, -T3, +R]
@specialized
is how Scala lets you avoid boxing on generics with primitives. But this comes at a price of compiler having to generate additional methods and classes, so beyond a certain threshold it will just produce a ridiculous amount of code (if not crash outright). So Function
has, if my math is correct, 4 (specs on T1) x 6 (specs on R) = 24 copies of each specialized method and 24 extra classes in addition to just apply
and a generic trait.
Oh, and by the way, those methods are postfixed with $mc
and the JNI type signatures. So method ending in $mcDII
is a specialized overload that returns a Double, and accepts two Ints as parameters. This is the type of function you're passing into tabulate
inside transform, i.e. this part
(x, y) => f(a(x)(y))
While calls to f
should show up with $mcDD
postfix (returns a Double and accepts a Double).
However, calling
f(x, y, a(x)(y))
will become something like
unbox(f(box(x), box(y), box(a(x)(y))))
So I bothered you enough with the explanation. It's time for solution. To bring boxing of both methods to equivalent shape, create a specialized interface:
trait DoubleFunction3 {
def apply(a: Double, b: Double, c: Double): Double
}
and rewrite your signature in transformXY
def transformXY(f: DoubleFunction3): Container = //... same code
Since it's Scala 2.12 and you have just one abstract method in the trait, you can still pass lambdas, so this code:
t = t.transformXY { (x, y, h) =>
h + math.sqrt(x * x + y * y)
}
requires no change.
Now you might notice that this does not fully eliminate boxing because tabulate
causes it too. This is definition of a single-dimensional tabulate
:
def tabulate[T: ClassTag](n: Int)(f: Int => T): Array[T] = {
val b = newBuilder[T]
b.sizeHint(n)
var i = 0
while (i < n) {
b += f(i)
i += 1
}
b.result()
}
Note that it works with a generic Builder[T]
, calling a method +=(elem: T)
. Builder
itself is not specialized, so it will do wasteful boxing/unboxing when creating your arrays. Your fix for this is to write a version that directly uses Double
instead of T
, for the dimensions you need.
Upvotes: 1