Ukonn Ra

Reputation: 754

High performance wrapper for Scala-to-Java collections conversion

Say I have a Java class with some business logic:

package examples;

import java.util.List;
import java.util.Map;

public class Inner {
    public void consume(Map<Integer, List<Double>> map) {
        map.forEach((k, v) -> {
            System.out.println("Key: " + k);
            v.forEach(i -> System.out.println("  item: " + i));
        });
    }
}

Now I want to write a high-performance Scala wrapper that stays as close to native as possible (because this wrapper may be called at high frequency), so:

The First Attempt

package examples

class Wrapper(val asJava: Inner) extends AnyVal {
  implicit def consume(map: Map[Int, List[Double]]): Unit = asJava.consume(map)
}

But I got errors:

[error]  found   : Map[Int,List[Double]]               (in scala.collection.immutable)
[error]  required: Map[Integer,java.util.List[Double]] (in java.util)
[error]   implicit def consume(map: Map[Int, List[Double]]): Unit = asJava.consume(map)
[error]                                                                            ^
[error] one error found

The Second Attempt

package examples

import scala.jdk.CollectionConverters._

class Wrapper(val asJava: Inner) extends AnyVal {
  implicit def consume(map: Map[Int, List[Double]]): Unit = asJava.consume(map.asJava)
}

I still got errors:

[error]  found   : java.util.Map[Int,List[scala.Double]]
[error]  required: java.util.Map[Integer,java.util.List[java.lang.Double]]
[error]   implicit def consume(map: Map[Int, List[Double]]): Unit = asJava.consume(map.asJava)
[error]                                                                                ^
[error] one error found

I am also afraid that scala.jdk.CollectionConverters._ may allocate extra memory and waste a lot of time.

Questions

  1. Can I use scala.jdk.CollectionConverters._ in a high-performance scenario?
  2. In Kotlin, all of the primitive types and most of the collections can be converted to and from their Java equivalents with almost no performance loss. How can I achieve the same in Scala?

Upvotes: 1

Views: 914

Answers (3)

Mike Allen

Reputation: 8299

UPDATE 1: I've expanded on the answer to provide more detail and explanation.

Use Integer instead of Int, and java.lang.Double instead of scala.Double (which is what Double is interpreted to be in Scala) in your Scala definitions.

The problem is that Java does not allow the use of primitives in collections, which is why it uses Integer instead of int as the type, and Double instead of double. In Scala, the treatment of Int, Double, etc. is more complex. It will treat them as primitives (i.e., like Java's int, double, etc.) in most cases, but it will box (and unbox) them when used with collections.
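The boxing described above can be observed directly. This is a minimal sketch, assuming Scala 2.13 or later; the object name BoxingDemo is made up for illustration. It looks at the runtime class of an element stored in a List[Int]:

```scala
object BoxingDemo {
  // Statically this is a List[Int], but the elements are stored as objects.
  val xs: List[Int] = List(1, 2, 3)

  // Reading the first element as AnyRef (without unboxing it) shows
  // the class of the object that is actually stored in the list.
  val elementClassName: String =
    xs.asInstanceOf[List[AnyRef]].head.getClass.getName

  def main(args: Array[String]): Unit =
    println(elementClassName) // prints java.lang.Integer
}
```

So the elements of a Scala List[Int] are already java.lang.Integer objects at runtime; only the static type differs from what Java expects.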

However, when inter-operating with Java, it's necessary to be more explicit. Scala will implicitly convert Integer instances to Int instances—and vice versa—when necessary, but it cannot implicitly convert collections of boxed primitives.
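To illustrate the difference between single values and collections, here is a small sketch (the object name ConversionDemo is hypothetical, and the commented-out line is there only to show what the compiler rejects):

```scala
import java.lang.{Double => JDouble}
import java.util.{List => JList}
import scala.jdk.CollectionConverters._

object ConversionDemo {
  // Single values: Predef supplies implicit conversions in both directions.
  val boxed: Integer = 42   // Int => Integer
  val unboxed: Int = boxed  // Integer => Int

  // Collections: the element type is not converted implicitly.
  val doubles: List[Double] = List(1.5, 2.5)
  // val bad: JList[JDouble] = doubles.asJava  // does not compile

  // One explicit (copying) way to get the boxed element type.
  val good: JList[JDouble] = doubles.map(d => JDouble.valueOf(d)).asJava
}
```

The explicit map shown here copies the list; it is only one way to satisfy the type checker, not necessarily the cheapest.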

You can use the Java collections in Scala just fine, with zero overhead; you don't need to convert them to Scala equivalents at all. However, you will obviously need to perform conversions if you pass them to code that expects Scala's equivalent collections, or if you want to use Scala collections with your legacy Java code. (In case it's not clear, a java.util.Map is not the same as a scala.collection.immutable.Map, nor is a java.util.List the same as a scala.collection.immutable.List. The default Scala collections, including the Map and List used here, are immutable and designed for efficient use in a functional programming style; mutable variants live in scala.collection.mutable.)
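As a rough sketch of what using the Java collections directly looks like (the object and method names are invented for this example), the following builds a value of exactly the type that Inner.consume expects, with no Scala collections involved:

```scala
import java.lang.{Double => JDouble}
import java.util.{ArrayList => JArrayList, HashMap => JHashMap, List => JList, Map => JMap}

object JavaCollectionsInScala {
  // Builds a java.util.Map of boxed keys to java.util.List of boxed values.
  def build(): JMap[Integer, JList[JDouble]] = {
    val values: JList[JDouble] = new JArrayList[JDouble]()
    values.add(JDouble.valueOf(1.5)) // explicit boxing, as Java would do
    values.add(JDouble.valueOf(2.5))

    val map: JMap[Integer, JList[JDouble]] = new JHashMap[Integer, JList[JDouble]]()
    map.put(Integer.valueOf(1), values)
    map
  }
}
```

A map built this way could be passed straight to Inner.consume without any conversion step.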

If you're concerned about performance, you should use a micro-benchmarking toolkit, such as ScalaMeter, which will allow you to measure, precisely, comparisons of using the Java collections natively versus converting to/from Scala collections.

UPDATE 2:

I've re-written your first attempt, using Java's collections. To avoid confusion with Scala types, I've renamed the clashing Java types by prefixing them with J:

package examples

import java.lang.{Double => JDouble}
import java.util.{List => JList, Map => JMap}
import scala.language.implicitConversions

class Wrapper(val asJava: Inner) {
  implicit def consume(map: JMap[Integer, JList[JDouble]]): Unit = {
    asJava.consume(map)
  }
}

This uses the Java Map and List collections, as well as the boxed primitives. It compiles, but it buys you absolutely nothing (the signature of Wrapper.consume is exactly the same as Inner.consume). However, it does illustrate how to use the Java collections in situ in Scala.

If you want to use Scala collections, and convert to the Java equivalents, which I think is the intention of your second attempt, then that would look like this:

package examples

import java.lang.{Double => JDouble}
import scala.jdk.CollectionConverters._
import scala.language.implicitConversions

class Wrapper(val asJava: Inner) {
  implicit def consume(map: Map[Int, List[Double]]): Unit = {
    val jmap = map.asInstanceOf[Map[Integer, List[JDouble]]]
    asJava.consume(jmap.map(p => p._1 -> p._2.asJava).asJava)
  }
}

In this case, we first have to cast the collection to use the boxed types (the definitions are equivalent under the hood), which has no overhead as such.

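Then we have to convert the Scala data structures to the equivalent Java data structures. There's clearly an overhead in doing this: the call to map builds a new outer Map with one entry per key, and each asJava allocates a wrapper that exposes the Scala data structure through the Java interface.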
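The two steps can be looked at separately. The sketch below (CastThenWrapDemo is a made-up name, and the sample data is arbitrary) checks that the cast returns the very same object, and then performs the conversion from the snippet above:

```scala
import java.lang.{Double => JDouble}
import java.util.{List => JList, Map => JMap}
import scala.jdk.CollectionConverters._

object CastThenWrapDemo {
  val scalaMap: Map[Int, List[Double]] = Map(1 -> List(1.5, 2.5))

  // Step 1: the cast only changes the static type; at runtime it is
  // still the same map object holding the same boxed elements.
  val boxedView: Map[Integer, List[JDouble]] =
    scalaMap.asInstanceOf[Map[Integer, List[JDouble]]]
  val castIsSameObject: Boolean = boxedView eq scalaMap

  // Step 2: map builds a new outer Map, each inner List is wrapped by
  // asJava, and the final asJava wraps the new outer Map.
  val javaMap: JMap[Integer, JList[JDouble]] =
    boxedView.map(p => p._1 -> p._2.asJava).asJava
}
```

Here castIsSameObject is true, which is what "no overhead as such" means for the cast; the cost sits in step 2.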

UPDATE 3

I should also point out that making the Wrapper.consume method implicit is an odd choice. I think you may have intended to make the class implicit, so that (in the second attempt) you can implicitly consume Scala data structures with an Inner instance. (In effect, Inner gets decorated by the functions provided in Wrapper.) Inner instances will be implicitly converted to Wrapper instances, with no overhead (this will not even create an instance of Wrapper in most cases, because of the extends AnyVal clause):

import java.lang.{Double => JDouble}
import scala.jdk.CollectionConverters._
import scala.language.implicitConversions

package object examples {
  implicit class Wrapper(val asJava: Inner) extends AnyVal {
    def consume(map: Map[Int, List[Double]]): Unit = {
      val jmap = map.asInstanceOf[Map[Integer, List[JDouble]]]
      asJava.consume(jmap.map(p => p._1 -> p._2.asJava).asJava)
    }
  }
}

Note that Wrapper now needs to be defined inside a package object. (This is because, in Scala 2, an implicit class cannot be a top-level definition: it must be nested inside an object, class, or trait whose members are then brought into scope, and a package object is generally the most convenient way to achieve this.)
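For completeness, here is a sketch of how the implicit class is then used at the call site. To keep it self-contained, Inner is replaced by a hypothetical Scala stand-in that sums the values instead of printing them, and everything lives in an ordinary object (ImplicitWrapperDemo, also a made-up name) rather than a package object:

```scala
import java.lang.{Double => JDouble}
import java.util.{List => JList, Map => JMap}
import scala.jdk.CollectionConverters._

object ImplicitWrapperDemo {
  // Stand-in for the Java Inner class (an assumption for this sketch):
  // it records the sum of all values it is given.
  class Inner {
    var lastSum: Double = 0.0
    def consume(map: JMap[Integer, JList[JDouble]]): Unit = {
      var sum = 0.0
      map.forEach((_, v) => v.forEach(d => sum += d.doubleValue))
      lastSum = sum
    }
  }

  implicit class Wrapper(val asJava: Inner) extends AnyVal {
    def consume(map: Map[Int, List[Double]]): Unit = {
      val jmap = map.asInstanceOf[Map[Integer, List[JDouble]]]
      asJava.consume(jmap.map(p => p._1 -> p._2.asJava).asJava)
    }
  }

  def run(): Double = {
    val inner = new Inner
    val data: Map[Int, List[Double]] = Map(1 -> List(1.0, 2.0), 2 -> List(3.0))
    // Inner.consume does not accept a Scala Map, so the compiler
    // falls back to the consume method provided by Wrapper.
    inner.consume(data)
    inner.lastSum
  }
}
```

The call inner.consume(data) reads as if Inner itself accepted Scala collections, which is the decoration effect described above.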

Upvotes: 5

Mario Galic

Reputation: 48430

Here is a JMH benchmark indicating that the scala.jdk.CollectionConverters conversions have negligible cost in this scenario:

import org.openjdk.jmh.annotations._
import scala.jdk.CollectionConverters._

@State(Scope.Benchmark)
@BenchmarkMode(Array(Mode.Throughput))
class So59827649 {
  def consumeScala(inner: Inner, map: Map[Int, List[Double]]): Unit = {
    val jmap = map.asInstanceOf[Map[Integer, List[java.lang.Double]]]
    inner.consume(jmap.map(p => p._1 -> p._2.asJava).asJava)
  }

  def consumeJava(inner: Inner, map: java.util.Map[Integer, java.util.List[java.lang.Double]]): Unit = {
    inner.consume(map)
  }

  val inner = new Inner
  val size = 1000
  val scalaMap = (1 to size).map(i => i ->  List.fill(size)(math.random)).toMap
  val javaMap = (1 to size).map(i => int2Integer(i) -> List.fill(size)(double2Double(math.random)).asJava).toMap.asJava

  @Benchmark def _consumeScala(): Unit = consumeScala(inner, scalaMap)
  @Benchmark def _consumeJava(): Unit = consumeJava(inner, javaMap)
}

and

public class Inner {
    public void consume(java.util.Map<Integer, java.util.List<Double>> map) {
        map.forEach((k, v) -> v.stream().mapToDouble(Double::doubleValue).sum());
    }
}

where sbt "jmh:run -i 10 -wi 5 -f 2 -t 1 bench.So59827649" gives

[info] Benchmark                  Mode  Cnt   Score   Error  Units
[info] So59827649._consumeJava   thrpt   20  97.963 ± 0.980  ops/s
[info] So59827649._consumeScala  thrpt   20  96.350 ± 2.326  ops/s

Upvotes: 1

Alexey Romanov

Reputation: 170909

asJava/asScala create simple wrappers. They allocate a single extra object and most methods will just delegate to the underlying collection. But the thing is, Java's collection classes are already very far from optimal for primitives; instead there are quite a few primitive collection libraries around. This 2015 article lists FastUtil, Goldman Sachs, HPPC, Koloboke, Trove. There are also Eclipse Collections and probably more I don't know about.
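A small sketch of the wrapper behaviour (WrapperViewDemo is an invented name; a mutable ArrayBuffer is used only so that a later write can be observed):

```scala
import scala.collection.mutable.ArrayBuffer
import scala.jdk.CollectionConverters._

object WrapperViewDemo {
  val buffer: ArrayBuffer[Int] = ArrayBuffer(1, 2)

  // asJava wraps the buffer in a java.util.List; no elements are copied.
  val javaView: java.util.List[Int] = buffer.asJava

  // A later write to the Scala buffer is visible through the Java view,
  // because the view delegates to the underlying collection.
  buffer += 3
  val sizeSeenFromJava: Int = javaView.size

  // Converting back unwraps the view and returns the original buffer.
  val roundTripIsSame: Boolean = javaView.asScala eq buffer
}
```

Here sizeSeenFromJava is 3 and roundTripIsSame is true, which is consistent with asJava being a thin delegating wrapper rather than a copy.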

Upvotes: 1
