Boxing Double during Array.tabulate

Question

I am experiencing a boxing issue which affects negatively performance of my Scala code. I have extracted the relevant code, which still shows the issue, with some added strangeness. I have the following representation of a 2D Double array which allows me to perform transformations on it by providing my functions:

case class Container(
  a: Array[Array[Double]] = Array.tabulate[Double](10000, 10000)((x,y) => x.toDouble * y)
) {
  def transformXY(f: (Double, Double, Double) => Double): Container = {
    Container(Array.tabulate[Double](a.length, a.length) { (x, y) =>
      f(x, y, a(x)(y))
    })
  }

  def transform(f: Double => Double): Container = {
    Container(Array.tabulate[Double](a.length, a.length) { (x, y) =>
      f(a(x)(y))
    })
  }
}

Following code reproduces the issue for me:

object Main extends App {

  def now = System.currentTimeMillis()

  val iters = 3

  def doTransformsXY() = {
    var t = Container()
    for (i <- 0 until iters) {
      val start = now
      t = t.transformXY { (x, y, h) =>
        h + math.sqrt(x * x + y * y)
      }
      println(s"transformXY: Duration ${now - start}")
    }
  }

  def doTransforms() = {
    var t = Container()
    for (i <- 0 until iters) {
      val start = now
      t = t.transform { h =>
        h + math.sqrt(h * h * h)
      }
      println(s"transform: Duration ${now - start}")
    }
  }

  if (true) { // Shows a lot of boxing if enabled
    doTransformsXY()
  }

  if (true) { // Shows a lot of boxing again - if enabled
    doTransformsXY()
  }

  if (true) { // Shows java8.JFunction...apply()
    doTransforms()
  }

  if (true) { // Shows java8.JFunction...apply() if doTransforms() is enabled
    doTransformsXY()
  }

}

When I run this code and sample it using Java VisualVM, I experience the following:

while doTransformsXY is running, I see a lot of time spent in scala.runtime.BoxesRunTime.boxToDouble()
once doTransforms is running, there is no more significant time spend it boxing, samples show scala.runtime.java8.JFunction2$mcDII$sp.apply() instead
I run doTransformsXY again, there is still no significant boxing, again time grows in scala.runtime.java8.JFunction2$mcDII$sp.apply()

This is with Scala 2.12.4, Windows x64 jdk1.8.0_92

My primary question is about the boxing, which I see in my production code as well:

why is there Double boxing happening in Array.tabulate? Do I need to go procedural (while loops, manual Array creation) to avoid it?

My secondary question is:

why is no more boxing done once I call the transform variant?

Oleg Pyzhcov · Accepted Answer

why is no more boxing done once I call the transform variant?

I did not reproduce that. If I carefully pause VMs and check with JProfiler, it still does a lot of boxing and allocation of Doubles. Which is what I expected, and I have an explanation for.

Looking at the Function1 and Function2 traits in standard library, we can see @specialized annotations:

trait Function1[@specialized(Int, Long, Float, Double) -T1, @specialized(Unit, Boolean, Int, Float, Long, Double) +R]
trait Function2[@specialized(Int, Long, Double) -T1, @specialized(Int, Long, Double) -T2, @specialized(Unit, Boolean, Int, Float, Long, Double) +R]

but the Function3 is just

trait Function3[-T1, -T2, -T3, +R]

@specialized is how Scala lets you avoid boxing on generics with primitives. But this comes at a price of compiler having to generate additional methods and classes, so beyond a certain threshold it will just produce a ridiculous amount of code (if not crash outright). So Function has, if my math is correct, 4 (specs on T1) x 6 (specs on R) = 24 copies of each specialized method and 24 extra classes in addition to just apply and a generic trait.

Oh, and by the way, those methods are postfixed with $mc and the JNI type signatures. So method ending in $mcDII is a specialized overload that returns a Double, and accepts two Ints as parameters. This is the type of function you're passing into tabulate inside transform, i.e. this part

(x, y) => f(a(x)(y))

While calls to f should show up with $mcDD postfix (returns a Double and accepts a Double).

However, calling

f(x, y, a(x)(y))

will become something like

unbox(f(box(x), box(y), box(a(x)(y))))

So I bothered you enough with the explanation. It's time for solution. To bring boxing of both methods to equivalent shape, create a specialized interface:

trait DoubleFunction3 {
  def apply(a: Double, b: Double, c: Double): Double
}

and rewrite your signature in transformXY

def transformXY(f: DoubleFunction3): Container = //... same code

Since it's Scala 2.12 and you have just one abstract method in the trait, you can still pass lambdas, so this code:

  t = t.transformXY { (x, y, h) =>
    h + math.sqrt(x * x + y * y)
  }

requires no change.

Now you might notice that this does not fully eliminate boxing because tabulate causes it too. This is definition of a single-dimensional tabulate:

  def tabulate[T: ClassTag](n: Int)(f: Int => T): Array[T] = {
    val b = newBuilder[T]
    b.sizeHint(n)
    var i = 0
    while (i < n) {
      b += f(i)
      i += 1
    }
    b.result()
  }

Note that it works with a generic Builder[T], calling a method +=(elem: T). Builder itself is not specialized, so it will do wasteful boxing/unboxing when creating your arrays. Your fix for this is to write a version that directly uses Double instead of T, for the dimensions you need.

Boxing Double during Array.tabulate

Answers (1)

Related Questions