Reputation: 30058
When I write such code:
Stream<Integer> xs = Arrays.asList(1, 3, 5, 6, 7, 10).stream();
xs.map(x -> x * x).filter (x -> x > 15).forEach(System.out::println);
Java8 streams are divided into two sections; intermediate vs terminal operations, where the -AFAIK - actual action (under-the-hood iterations) is done in the terminal ops, while each intermediate ops appends its own -let me name it- Apply inner classes.
By this, there will be only one iteration over the list.
Sample code from JDK8:
@Override
@SuppressWarnings("unchecked")
public final <R> Stream<R> map(Function<? super P_OUT, ? extends R> mapper) {
Objects.requireNonNull(mapper);
return new StatelessOp<P_OUT, R>(this, StreamShape.REFERENCE,
StreamOpFlag.NOT_SORTED | StreamOpFlag.NOT_DISTINCT) {
@Override
Sink<P_OUT> opWrapSink(int flags, Sink<R> sink) {
return new Sink.ChainedReference<P_OUT, R>(sink) {
@Override
public void accept(P_OUT u) {
downstream.accept(mapper.apply(u));
}
};
}
};
}
When I write such code:
val xs = List(1, 3, 5, 6, 7, 10)
xs map (x => x * x) filter (x => x > 15) foreach (println)
I have been a while reading about it, but I never hear about such terms explicitly, moreover, the SDK implementation loops (either using recursion or regular loops) on each operation:
final override def map[B, That](f: A => B)(implicit bf: CanBuildFrom[List[A], B, That]): That = {
if (bf eq List.ReusableCBF) {
if (this eq Nil) Nil.asInstanceOf[That] else {
val h = new ::[B](f(head), Nil)
var t: ::[B] = h
var rest = tail
while (rest ne Nil) {
val nx = new ::(f(rest.head), Nil)
t.tl = nx
t = nx
rest = rest.tail
}
h.asInstanceOf[That]
}
}
else super.map(f)
}
Can we consider that Java implementation on the same thing would be much faster. (O(n) in Java
vs O(multiples of n) in Scala
)
Upvotes: 12
Views: 392
Reputation: 167891
Java 8 streams are less-feature-complete cousins of Scala iterators save for their ability to compute in parallel.
If you don't need parallel computation (and most times the overhead is not worth it--only for big expensive jobs do you want it), then you can get the same sort of processing with .iterator
in Scala (and then to[Vector]
or whatever you want at the end).
Java 8 streams are manually specialized (Scala Iterator
is not), so there are use cases where they are way faster, but it's not because of the constant factor due to recreating collections along the way--at least, not if you throw a .iterator
in there. (Without .iterator
, Scala collections evaluate eagerly by default; Java collections do not have that option.)
The Scala equivalent for the Java 8 code you wrote is the following:
val xsi = Array(1, 3, 5, 6, 7, 10).iterator
xsi.map(x => x*x).filter(_ > 15).foreach(println)
There is no difference in the number of collections created here with Scala vs. Java.
It would probably be a good idea to adopt the very clear "terminal operation" language for Scala's Iterator documentation. The Java 8 stream docs are great in that they make it exquisitely clear when you're building up the description of the work and when you're finally doing it.
Scala also provides a Stream
class that memoizes old work (so you don't have to compute it a second time if you're reusing it), and various views
so you don't have to recreate the processing chain each time you want to use it. For example, with your squaring, you could
val xsv = Array(1, 3, 5, 6, 7, 10).view
val xsq = xsv.map(x => x*x)
xsq.filter(_ > 15).foreach(println)
xsq.filter(_ < 5).foreach(println)
whereas with Java 8 streams xsq
would be exhausted after the first terminal operation.
So Scala actually does everything (save parallelism) that Java 8 streams do, and quite a bit more besides, and has for a long time.
Scala also has parallelizing collections, but the Java 8 implementation is sufficiently superior in performance that at this point I'd recommend using them first. And again, if manual specialization is your thing, Java 8 streams have it for Int, Double, and Long, and that's a big performance win. (Note: your example, using asList
, is not manually specialized.)
But if you just want to queue up operations and not have the overhead of building intermediate collections, Scala does it. You just have to ask.
Upvotes: 12
Reputation: 28036
The list in Scala is eager, which means that (as you say) there are multiple iterations over the list. There are (at least) two ways to get around this.
Using a view:
xs.view.map(x => x * x).filter(x => x > 15).force
Or by converting the list to a stream (which is lazy):
xs.toStream.map(x => x * x).filter(x => x > 15)
Upvotes: 9