Reputation: 8128
I have benchmarked several ways to fold a large array of primitives ("direct" and with iterators), and the results are disappointing. (Yes, I have done warmup, intermediate GC and many run passes, running JVM in server mode and scalac
optimisations are enabled (and debugging info is disabled)).
I think code is too big to post here, so here is link: http://pastebin.com/18dWWBM4 The only method there that runs nearly as good as plain old imperative loop is this not-so-generic hand-written function:
@inline def array_foldl[@specialized A, @specialized B](init: B)(src: Array[A])(fun: (B, A) => B) = {
var res = init
var i = 0
var len = src.length
while (i < len) {
res = fun(res, src(i))
i += 1
}
res
}
Other visually nice methods are complete outsiders. Also, using iterator abstractions fails in all cases, with hand-written parody to the standart Iterator called SpecializedIterator
being slightly faster. So what's the problem? Can it be improved somehow? Is there a way to make "fast" iterator, or there is a big problem in the principle itself?
Thanks for attention.
Upvotes: 2
Views: 371
Reputation: 167901
The problem is boxing. It takes a lot longer to create an object than to add two numbers, but if you use generic (non-specialized) folds, you have to create an object every time. The problem with just specializing everything is that you make the entire library 100x larger since you need every combination of two primitive parameters (including with non-primitives), plus the original no-type-parameter version. (100x because there are 8 primitives plus Unit
plus AnyRef
/non-specialized T
.) This is untenable, and since there is no readily available alternate solution, the collections are presently unspecialized.
Also, specialization itself is relatively new and thus still has some deficits in its implementation. In particular, you seem to have hit one with SpecializedIterator
: the function in foreach
doesn't end up specialized (I collapsed the trait/object thing into a single class to make it easier to track down):
public class Main$SpecializedArrayIterator$mcJ$sp extends Main$SpecializedArrayIterator{
public final void foreach$mcJ$sp(scala.Function1);
Code:
0: aload_0
1: invokevirtual #39; //Method Main$SpecializedArrayIterator.hasNext:()Z
4: ifeq 24
7: aload_1
8: aload_0
9: invokevirtual #14; //Method next$mcJ$sp:()J
12: invokestatic #45; //Method scala/runtime/BoxesRunTime.boxToLong:(J)Ljava/lang/Long;
15: invokeinterface #51, 2; //InterfaceMethod scala/Function1.apply:(Ljava/lang/Object;)Ljava/lang/Object;
20: pop
21: goto 0
24: return
See the box at line 12, followed by a call to un-specialized Function1? Oops. (The tuple (A, (A,A) => A)
used in sum
also messes up specialization.) An implementation like this is full speed:
class SpecializedArrayIterator[@specialized A](src: Array[A]) {
var i = 0
val l = src.length
@inline final def hasNext: Boolean = i < l
@inline final def next(): A = { val res = src(i); i += 1; res }
@inline final def foldLeft[@specialized B](z: B)(op: (B, A) => B): B = {
var result = z
while (hasNext) result = op(result,next)
result
}
}
...
measure((new SpecializedArrayIterator[Long](test)).foldLeft(0L)(_ + _))
...
With results like so:
Launched 51298 times in 2000 milliseconds, ratio = 25.649 // New impl
Launched 51614 times in 2000 milliseconds, ratio = 25.807 // While loop
Upvotes: 4