rmin

Reputation: 1128

Tips for forcing the Scala compiler to use specialized methods

The Scala standard library has some classes and methods that are specialized for primitives to avoid boxing. For example,

def at3(f: Int => Boolean): Boolean = {
  f(3)
}

compiles to the following bytecode:

  public boolean at3(scala.Function1<java.lang.Object, java.lang.Object>);
    Code:
       0: aload_1
       1: iconst_3
       2: invokeinterface #35,  2           // InterfaceMethod scala/Function1.apply$mcZI$sp:(I)Z
       7: ireturn

It's using Function1.apply$mcZI$sp(int): boolean instead of the regular Function1.apply(Object): Object because that particular combination of types is specialized on Function1. That means no boxing.
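As a rough illustration of the naming scheme: the suffix in apply$mcZI$sp encodes the result type (Z for Boolean) followed by the argument type (I for Int), using JVM descriptor letters. The helper below is hypothetical, not a compiler or library API. It assumes Function1 is specialized for Int, Long, Float and Double arguments, and for those types plus Boolean and Unit as results.

```scala
// Hypothetical helper, for illustration only: it rebuilds the mangled name of
// the specialized Function1.apply variant for a given argument/result pair.

// JVM descriptor letters for the argument types assumed to be specialized.
val argCodes: Map[String, Char] =
  Map("Int" -> 'I', "Long" -> 'J', "Float" -> 'F', "Double" -> 'D')

// Result types additionally include Boolean (Z) and Unit (V).
val resultCodes: Map[String, Char] =
  argCodes ++ Map("Boolean" -> 'Z', "Unit" -> 'V')

// Returns None when the combination is not in the assumed specialized set.
def specializedApplyName(arg: String, result: String): Option[String] =
  for {
    a <- argCodes.get(arg)
    r <- resultCodes.get(result)
  } yield s"apply$$mc${r}${a}$$sp"
```

Under these assumptions, specializedApplyName("Int", "Boolean") gives the apply$mcZI$sp name seen in the bytecode above, while a reference type such as String yields None.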

However in my codebase, I have some "strong types" which are primitives with a trait mixed in that disappears at runtime. Here's a simplified recreation using Natural as an example:

sealed trait NaturalTag
type Natural = Int & NaturalTag
                                                                                                 
def int2NatUnsafe(i: Int): Natural =
  if (i >= 0) i.asInstanceOf[Natural]
  else throw new Exception(s"$i is negative, not a Natural")

// Example
def at3Nat(f: Natural => Boolean): Boolean = {
  f(int2NatUnsafe(3)) // <--- f.apply(...)
}

The example code decompiles to:

  public boolean at3Nat(scala.Function1<java.lang.Object, java.lang.Object>);
    Code:
       0: aload_1
       1: aload_0
       2: iconst_3
       3: invokevirtual #72                 // Method int2NatUnsafe:(I)I
       6: invokestatic  #78                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
       9: invokeinterface #82,  2           // InterfaceMethod scala/Function1.apply:(Ljava/lang/Object;)Ljava/lang/Object;
      14: invokestatic  #86                 // Method scala/runtime/BoxesRunTime.unboxToBoolean:(Ljava/lang/Object;)Z
      17: ireturn

Unfortunately, the compiler uses Function1.apply(Object): Object, which boxes the int argument and unboxes the boolean result. I understand why it's doing that, but I'm wondering if there's a way to trick it into using the specialized method.

The compiler won't let me directly use the specialized method:

def at3Direct(f: Natural => Boolean): Boolean = {
  f.apply$mcZI$sp(3)
}
-- [E008] Not Found Error: Demo.scala:17:6 --------------------------------------------------------------------------------------------------------------------------------------------------------------------------
17 |    f.apply$mcZI$sp(3)
   |    ^^^^^^^^^^^^^^^
   |    value apply$mcZI$sp is not a member of Demo.Natural => Boolean
1 error found

For some background, I'm using Scala 3.5.0, and I'm aware that specialization is a bit vaguely defined in Scala 3.

This pattern shows up in a lot of really hot code paths that I'm trying to optimise in my app. I already have workarounds that avoid the boxing, based on inlining or on casting the Function1, so I'm mostly just curious about general background information or tips.
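For reference, the two kinds of workaround mentioned above might look roughly like the sketch below. The names at3NatInline and at3NatCast are made up for illustration. Whether the cast version avoids boxing end to end likely depends on how the function value itself was compiled, so it is worth checking the javap output in each case.

```scala
sealed trait NaturalTag
type Natural = Int & NaturalTag

def int2NatUnsafe(i: Int): Natural =
  if (i >= 0) i.asInstanceOf[Natural]
  else throw new Exception(s"$i is negative, not a Natural")

// Workaround sketch 1 (inlining): with an inline parameter, the function
// literal is expanded at the call site, so this method does not need to go
// through Function1.apply at all.
inline def at3NatInline(inline f: Natural => Boolean): Boolean =
  f(int2NatUnsafe(3))

// Workaround sketch 2 (casting): view the function as Int => Boolean so the
// call site is typed with a primitive argument. The cast is unchecked and
// relies on Natural being an Int at runtime.
def at3NatCast(f: Natural => Boolean): Boolean = {
  val g = f.asInstanceOf[Int => Boolean]
  g(int2NatUnsafe(3))
}
```

Both can be called like the original, e.g. at3NatInline(n => n > 2) or at3NatCast(n => n > 2).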

I've played around a bit with opaque types as an alternative way to encode strong types, but they have their own issues. At least with opaque types, the Scala compiler clearly understands that at runtime the opaque type will be the primitive it hides, and it specializes when applicable.

Thanks in advance for any tips or general background knowledge!

Upvotes: 5

Views: 42

Answers (1)

stefanobaghino

Reputation: 12794

Would an opaque type be suitable for you as a replacement for a tagged type? One of the reasons for their introduction was exactly avoiding boxing.

Given the following code:

sealed trait NaturalTag
type Natural = Int & NaturalTag

opaque type OpaqueNatural = Int

def int2NatUnsafe(i: Int): Natural =
  if (i >= 0) i.asInstanceOf[Natural]
  else throw new Exception(s"$i is negative, not a Natural")

def at3Nat(f: Natural => Boolean): Boolean = {
  f(int2NatUnsafe(3)) // <--- f.apply(...)
}

def at3OpaqueNat(f: OpaqueNatural => Boolean): Boolean = {
  f(3)
}

def at3(f: Int => Boolean): Boolean = {
  f(3)
}

The three functions above would decompile to the following:

  public boolean at3Nat(scala.Function1<java.lang.Object, java.lang.Object>);
    descriptor: (Lscala/Function1;)Z
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=3, locals=2, args_size=2
         0: aload_1
         1: aload_0
         2: iconst_3
         3: invokevirtual #59                 // Method int2NatUnsafe:(I)I
         6: invokestatic  #65                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
         9: invokeinterface #71,  2           // InterfaceMethod scala/Function1.apply:(Ljava/lang/Object;)Ljava/lang/Object;
        14: invokestatic  #75                 // Method scala/runtime/BoxesRunTime.unboxToBoolean:(Ljava/lang/Object;)Z
        17: ireturn
      LineNumberTable:
        line 11: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0      18     0  this   LNatural$package$;
            0      18     1     f   Lscala/Function1;
    Signature: #56                          // (Lscala/Function1<Ljava/lang/Object;Ljava/lang/Object;>;)Z
    MethodParameters:
      Name                           Flags
      f                              final

  public boolean at3OpaqueNat(scala.Function1<java.lang.Object, java.lang.Object>);
    descriptor: (Lscala/Function1;)Z
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=2, locals=2, args_size=2
         0: aload_1
         1: iconst_3
         2: invokeinterface #81,  2           // InterfaceMethod scala/Function1.apply$mcZI$sp:(I)Z
         7: ireturn
      LineNumberTable:
        line 14: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       8     0  this   LNatural$package$;
            0       8     1     f   Lscala/Function1;
    Signature: #56                          // (Lscala/Function1<Ljava/lang/Object;Ljava/lang/Object;>;)Z
    MethodParameters:
      Name                           Flags
      f                              final

  public boolean at3(scala.Function1<java.lang.Object, java.lang.Object>);
    descriptor: (Lscala/Function1;)Z
    flags: (0x0001) ACC_PUBLIC
    Code:
      stack=2, locals=2, args_size=2
         0: aload_1
         1: iconst_3
         2: invokeinterface #81,  2           // InterfaceMethod scala/Function1.apply$mcZI$sp:(I)Z
         7: ireturn
      LineNumberTable:
        line 18: 0
      LocalVariableTable:
        Start  Length  Slot  Name   Signature
            0       8     0  this   LNatural$package$;
            0       8     1     f   Lscala/Function1;
    Signature: #56                          // (Lscala/Function1<Ljava/lang/Object;Ljava/lang/Object;>;)Z
    MethodParameters:
      Name                           Flags
      f                              final
}

Notice that at3OpaqueNat and at3 decompile to the same bytecode, which seems to be what you were looking for.

Although the example might give the impression that you can use Int and OpaqueNatural interchangeably (which would defeat its purpose), IIUC this is only because we are in the same scope. When it's exposed to code outside of the defining scope, the alias becomes truly opaque. Given the following definition in a separate file:

def foo(): Unit = {
  at3OpaqueNat { n => n > 0 }
}

The compiler will complain that value > is not a member of OpaqueNatural.
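To make the opaque type usable from outside its defining scope, one common approach is to expose a constructor and a few extension methods next to it. The following is a minimal sketch under that assumption. The enclosing object naturals and the names fromInt, value and isPositive are hypothetical and not part of the code above.

```scala
object naturals {
  opaque type OpaqueNatural = Int

  object OpaqueNatural {
    // Hypothetical smart constructor: rejects negative inputs.
    def fromInt(i: Int): Option[OpaqueNatural] =
      if (i >= 0) Some(i) else None
  }

  // Inside `naturals` the alias is transparent, so `n` is just an Int here.
  extension (n: OpaqueNatural) {
    def value: Int = n
    def isPositive: Boolean = n > 0
  }

  def at3OpaqueNat(f: OpaqueNatural => Boolean): Boolean = {
    f(3)
  }
}

import naturals.*

// Outside `naturals`, only the extension methods are available on the type.
def foo(): Boolean = {
  at3OpaqueNat { n => n.value > 0 }
}
```

With this in place, foo compiles because the comparison goes through value rather than relying on OpaqueNatural being an Int. I have not checked the bytecode of this variant, so it would be worth confirming with javap that the call sites you care about still use the specialized method.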

Upvotes: 2
