Travis Brown

Reputation: 139038

Scala macros and the JVM's method size limit

I'm replacing some code generation components in a Java program with Scala macros, and am running into the Java Virtual Machine's limit on the size of the generated byte code for individual methods (64 kilobytes).

For example, suppose we have a large-ish XML file that represents a mapping from integers to integers that we want to use in our program. We want to avoid parsing this file at run time, so we'll write a macro that will do the parsing at compile time and use the contents of the file to create the body of our method:

import scala.language.experimental.macros
import scala.reflect.macros.Context

object BigMethod {
  // For this simplified example we'll just make some data up.
  val mapping = List.tabulate(7000)(i => (i, i + 1))

  def lookup(i: Int): Int = macro lookup_impl
  def lookup_impl(c: Context)(i: c.Expr[Int]): c.Expr[Int] = {
    import c.universe._

    val switch = reify(new scala.annotation.switch).tree
    val cases = mapping map {
      case (k, v) => CaseDef(c.literal(k).tree, EmptyTree, c.literal(v).tree)
    }

    c.Expr(Match(Annotated(switch, i.tree), cases))
  }
}

In this case the compiled method would be just over the size limit, but instead of a nice error saying that, we're given a giant stack trace with a lot of calls to TreePrinter.printSeq and are told that we've slain the compiler.

I have a solution that involves splitting the cases into fixed-sized groups, creating a separate method for each group, and adding a top-level match that dispatches the input value to the appropriate group's method. It works, but it's unpleasant, and I'd prefer not to have to use this approach every time I write a macro where the size of the generated code depends on some external resource.
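For concreteness, the generated code ends up with roughly the following shape (hand-written here with a group size of 1000 and made-up method names, assuming the dense 0-until-7000 keys from the example above):

object GroupedLookup {
  // In the real generated code each method body is a @switch-annotated match
  // with up to 1000 literal cases; only a couple are shown here.
  private def lookup0(i: Int): Int = i match {
    case 0 => 1
    case 1 => 2
    // ... cases 2 through 999 ...
    case _ => sys.error("no mapping for " + i)
  }

  private def lookup1(i: Int): Int = i match {
    case 1000 => 1001
    // ... cases 1001 through 1999 ...
    case _ => sys.error("no mapping for " + i)
  }

  // ... one method per remaining group ...

  // Top-level dispatch: with dense keys, integer division by the group size
  // picks the right group's method.
  def lookup(i: Int): Int = i / 1000 match {
    case 0 => lookup0(i)
    case 1 => lookup1(i)
    // ... one case per remaining group ...
    case _ => sys.error("no mapping for " + i)
  }
}

Each group method stays comfortably below the 64 KB limit; the unpleasant part is generating all of this plumbing from the macro.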

Is there a cleaner way to tackle this problem? More importantly, is there a way to deal with this kind of compiler error more gracefully? I don't like the idea of a library user getting an unintelligible "That entry seems to have slain the compiler" error message just because some XML file that's being processed by a macro has crossed some (fairly low) size threshold.

Upvotes: 36

Views: 3821

Answers (2)

Ondra Žižka

Reputation: 46796

IMO, putting data into .class files isn't really a good idea. They have to be parsed as well; they're just binary. And storing the data in the JVM may have a negative impact on the performance of the garbage collector and the JIT compiler.

In your situation, I would pre-compile the XML into a binary file in a suitable format and parse that instead. Eligible formats with existing tooling include e.g. FastRPC or good old DBF. Or maybe pre-fill an Elasticsearch repository if you need quick, advanced lookups and searches. Some implementations of the latter may also provide basic indexing, which could even leave the parsing out entirely - the app would just read from the respective offset.
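A minimal sketch of that offset-based idea, assuming dense integer keys 0 until N as in the question's example (a real setup would use one of the formats above rather than this hand-rolled encoding):

import java.io.{ DataOutputStream, FileOutputStream, RandomAccessFile }

object BinaryMapping {
  // Build step: write the value for key k at byte offset k * 4.
  def write(path: String, mapping: Seq[(Int, Int)]): Unit = {
    val out = new DataOutputStream(new FileOutputStream(path))
    try mapping.sortBy(_._1).foreach { case (_, v) => out.writeInt(v) }
    finally out.close()
  }

  // Run time: no parsing at all, just seek to the key's offset and read.
  def lookup(path: String, key: Int): Int = {
    val file = new RandomAccessFile(path, "r")
    try { file.seek(key.toLong * 4L); file.readInt() }
    finally file.close()
  }
}

Opening the file once (or memory-mapping it) would of course beat reopening it for every lookup; the sketch only shows the idea.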

Upvotes: 10

som-snytt

Reputation: 39577

Since somebody has to say something, I followed the instructions at Importers to try to compile the tree before returning it.

If you give the compiler plenty of stack, it will correctly report the error.

(It didn't seem to know what to do with the switch annotation, left as a future exercise.)

apm@mara:~/tmp/bigmethod$ skalac bigmethod.scala ; skalac -J-Xss2m biguser.scala ; skala bigmethod.Test
Error is java.lang.RuntimeException: Method code too large!
Error is java.lang.RuntimeException: Method code too large!
biguser.scala:5: error: You ask too much of me.
  Console println s"5 => ${BigMethod.lookup(5)}"
                                           ^
one error found

as opposed to

apm@mara:~/tmp/bigmethod$ skalac -J-Xss1m biguser.scala 
Error is java.lang.StackOverflowError
Error is java.lang.StackOverflowError
biguser.scala:5: error: You ask too much of me.
  Console println s"5 => ${BigMethod.lookup(5)}"
                                           ^

where the client code is just that:

package bigmethod

object Test extends App {
  Console println s"5 => ${BigMethod.lookup(5)}"
}

My first time using this API, but not my last. Thanks for getting me kickstarted.

package bigmethod

import scala.language.experimental.macros
import scala.reflect.macros.Context

object BigMethod {
  // For this simplified example we'll just make some data up.
  //final val size = 700
  final val size = 7000
  val mapping = List.tabulate(size)(i => (i, i + 1))

  def lookup(i: Int): Int = macro lookup_impl
  def lookup_impl(c: Context)(i: c.Expr[Int]): c.Expr[Int] = {

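    // Import the candidate tree into a runtime universe and compile it with a
    // toolbox first, so that "Method code too large!" can be caught here and
    // reported via c.abort instead of crashing the caller's compilation.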
    def compilable[T](x: c.Expr[T]): Boolean = {
      import scala.reflect.runtime.{ universe => ru }
      import scala.tools.reflect._
      //val mirror = ru.runtimeMirror(c.libraryClassLoader)
      val mirror = ru.runtimeMirror(getClass.getClassLoader)
      val toolbox = mirror.mkToolBox()
      val importer0 = ru.mkImporter(c.universe)
      type ruImporter = ru.Importer { val from: c.universe.type }
      val importer = importer0.asInstanceOf[ruImporter]
      val imported = importer.importTree(x.tree)
      val tree = toolbox.resetAllAttrs(imported.duplicate)
      try {
        toolbox.compile(tree)
        true
      } catch {
        case t: Throwable =>
          Console println s"Error is $t"
          false
      }
    }
    import c.universe._

    val switch = reify(new scala.annotation.switch).tree
    val cases = mapping map {
      case (k, v) => CaseDef(c.literal(k).tree, EmptyTree, c.literal(v).tree)
    }

    //val res = c.Expr(Match(Annotated(switch, i.tree), cases))
    val res = c.Expr(Match(i.tree, cases))

    // before returning a potentially huge tree, try compiling it
    //import scala.tools.reflect._
    //val x = c.Expr[Int](c.resetAllAttrs(res.tree.duplicate))
    //val y = c.eval(x)
    if (!compilable(res)) c.abort(c.enclosingPosition, "You ask too much of me.")

    res
  }
}

Upvotes: 4
