dantiston

Reputation: 5371

Collection.toArray() vs Collection.stream().toArray()

Consider the following code:

List<String> myList = Arrays.asList("1", "2", "3");
String[] myArray1 = myList.toArray(new String[myList.size()]);
String[] myArray2 = myList.stream().toArray(String[]::new);
assert Arrays.equals(myArray1, myArray2);

It seems to me that using a stream is much simpler.

To see whether that simplicity comes at a cost, I timed each approach.

List<String> myList = Arrays.asList("1", "2", "3");
long start; // System.currentTimeMillis() returns a long

start = System.currentTimeMillis();
for (int i = 0; i < 10_000_000; i++) {
    String[] myArray1 = myList.toArray(new String[myList.size()]);
    assert myArray1.length == 3;
}
System.out.println(System.currentTimeMillis() - start);

start = System.currentTimeMillis();
for (int i = 0; i < 10_000_000; i++) {
    String[] myArray2 = myList.stream().toArray(String[]::new);
    assert myArray2.length == 3;
}
System.out.println(System.currentTimeMillis() - start);

The result is that the stream version is about four times slower: on my machine, 816 ms (stream) vs 187 ms (no stream). I also tried swapping the order of the two timed loops (myArray2 before myArray1), which didn't affect the results much. Why is the stream so much slower? Is creating a Stream really that computationally expensive?

I followed @Holger's advice and studied up a bit (surely not enough) on JVM testing, reading this post, this article, this article, and using JMH.


Results (via JMH):

private static final List<String> myList = IntStream.range(1, 1000).mapToObj(String::valueOf).collect(Collectors.toList());

@Benchmark
public void testMethod() {
    String[] myArray = myList.stream().toArray(String[]::new);
}

StreamToArrayArrayListBenchmark.testMethod avgt 5 2846.346 ± 32.500 ns/op

private static final List<String> myList = IntStream.range(1, 1000).mapToObj(String::valueOf).collect(Collectors.toList());

@Benchmark
public void testMethod() {
    String[] myArray = myList.toArray(new String[0]);
}

ToArrayEmptyArrayListBenchmark.testMethod avgt 5 1417.474 ± 20.725 ns/op

private static final List<String> myList = IntStream.range(1, 1000).mapToObj(String::valueOf).collect(Collectors.toList());

@Benchmark
public void testMethod() {
    String[] myArray = myList.toArray(new String[myList.size()]);
}

ToArraySizedArrayListBenchmark.testMethod avgt 5 1853.622 ± 178.351 ns/op


private static final List<String> myList = new LinkedList<>(IntStream.range(1, 1000).mapToObj(String::valueOf).collect(Collectors.toList()));

@Benchmark
public void testMethod() {
    String[] myArray = myList.stream().toArray(String[]::new);
}

StreamToArrayLinkedListBenchmark.testMethod avgt 5 4152.003 ± 59.281 ns/op

private static final List<String> myList = new LinkedList<>(IntStream.range(1, 1000).mapToObj(String::valueOf).collect(Collectors.toList()));

@Benchmark
public void testMethod() {
    String[] myArray = myList.toArray(new String[0]);
}

ToArrayEmptyLinkedListBenchmark.testMethod avgt 5 4089.550 ± 29.880 ns/op

private static final List<String> myList = new LinkedList<>(IntStream.range(1, 1000).mapToObj(String::valueOf).collect(Collectors.toList()));

@Benchmark
public void testMethod() {
    String[] myArray = myList.toArray(new String[myList.size()]);
}

ToArraySizedLinkedListBenchmark.testMethod avgt 5 4115.557 ± 93.964 ns/op


To summarize (average time in ns/op, truncated to whole numbers; lower is better):

              | ArrayList | LinkedList
stream        | 2846      | 4152
toArray sized | 1853      | 4115
toArray empty | 1417      | 4089

Using JMH (possibly naively), I still see ArrayList::toArray running roughly 1.5 to 2 times as fast as Stream::toArray. As @Andreas pointed out, this seems to come from the ArrayList being able to do a plain array copy: when the source is a LinkedList, the three approaches are about equal.

It's definitely good to know about myList.toArray(new String[0]).

Upvotes: 2

Views: 3285

Answers (2)

Philipp Claßen

Reputation: 43950

Under the hood, streams are much more complicated than plain arrays. Compilers will get better, but currently, sequential for loops should be faster than stream operations.

This article has some background about stream pipelines, which are used to implement streams. It can help to understand the complexity behind it.

The advantage of streams is that the code can be clearer and it is easier to parallelize it.
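As a rough illustration of that trade-off, the sketch below adds a mapping step and an optional parallel switch to the same toArray call. The class and method names are made up for this example, and it says nothing about whether parallel execution pays off for a list this small.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of the question's code.
final class ParallelToArrayDemo {

    // Upper-cases every element and collects the result into an array.
    // Switching to parallel execution only changes how the stream is created.
    static String[] upperCased(List<String> source, boolean parallel) {
        return (parallel ? source.parallelStream() : source.stream())
                .map(String::toUpperCase)
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("a", "b", "c");
        // toArray keeps the list's encounter order in both modes.
        System.out.println(Arrays.toString(upperCased(letters, false)));
        System.out.println(Arrays.toString(upperCased(letters, true)));
    }
}
```

Both calls should give the same array, because toArray() preserves the encounter order of an ordered source even when the stream runs in parallel.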

Upvotes: 0

Andreas

Reputation: 159086

Arrays.asList() creates a fixed-size List that is directly backed by the varargs array parameter. Javadoc even says so:

Returns a fixed-size list backed by the specified array.

Its implementation of toArray() is a simple System.arraycopy(). Very fast.
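A small sketch of what "backed by the array" means in practice, using a hypothetical demo class: writes to the source array show up in the list, while toArray() hands back an independent copy.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
final class AsListBackingDemo {

    // True if a write to the source array is visible through the list view.
    static boolean writesThrough() {
        String[] source = {"1", "2", "3"};
        List<String> view = Arrays.asList(source);
        source[0] = "changed";
        return view.get(0).equals("changed");
    }

    // True if toArray returns a fresh copy rather than the backing array itself.
    static boolean toArrayCopies() {
        String[] source = {"1", "2", "3"};
        List<String> view = Arrays.asList(source);
        String[] copy = view.toArray(new String[0]);
        copy[0] = "changed"; // must not affect the list or the source array
        return copy != source && view.get(0).equals("1");
    }

    public static void main(String[] args) {
        System.out.println(writesThrough());
        System.out.println(toArrayCopies());
    }
}
```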

On the other hand, when you do myList.stream().toArray(String[]::new), the stream cannot do a bulk array copy. Stream.toArray() has to set up the pipeline and push every element through it one at a time into the result, and when the pipeline cannot report an exact size up front, it also has to buffer the values and then copy them into the final array. That is a lot slower, and can require a lot more memory.

In short, it's a waste of resources.

If you want simpler, just don't give the array size. It is still way faster and less memory intensive than using Streams:

String[] myArray1 = myList.toArray(new String[0]);
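For reference, here is a minimal side-by-side of the three idioms discussed in this thread, wrapped in a hypothetical helper class. It only checks that they produce equal arrays; it does not measure anything.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the method names are illustrative only.
final class ToArrayIdioms {

    // Pre-sized target array, as in the question.
    static String[] sized(List<String> list) {
        return list.toArray(new String[list.size()]);
    }

    // Zero-length target array; the list allocates a correctly sized one itself.
    static String[] empty(List<String> list) {
        return list.toArray(new String[0]);
    }

    // Stream-based variant from the question.
    static String[] viaStream(List<String> list) {
        return list.stream().toArray(String[]::new);
    }

    public static void main(String[] args) {
        List<String> myList = Arrays.asList("1", "2", "3");
        System.out.println(Arrays.equals(sized(myList), empty(myList)));
        System.out.println(Arrays.equals(empty(myList), viaStream(myList)));
    }
}
```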

Upvotes: 9
