nantitv
nantitv

Reputation: 3733

java stream performace for finding maximum element form a list

I wrote a simple program to compare to performance of stream for finding maximum form list of integer. Surprisingly I found that the performance of ' stream way' 1/10 of 'usual way'. Am I doing something wrong? Is there any condition on which Stream way will not be efficient? Could anyone have a nice explanation for this behavior?

"stream way" took 80 milliseconds "usual way" took 15 milli seconds Please find the code below

public class Performance {

public static void main(String[] args) {

    ArrayList<Integer> a = new ArrayList<Integer>();
    Random randomGenerator = new Random();
    for (int i=0;i<40000;i++){
        a.add(randomGenerator.nextInt(40000));
    }
    long start_s = System.currentTimeMillis( );

            Optional<Integer> m1 = a.stream().max(Integer::compare);

    long diff_s = System.currentTimeMillis( ) - start_s;
    System.out.println(diff_s);


    int e = a.size();
    Integer m = Integer.MIN_VALUE;

    long start = System.currentTimeMillis( );
    for(int i=0; i < e; i++) 
      if(a.get(i) > m) m = a.get(i);

    long diff = System.currentTimeMillis( ) - start;
    System.out.println(diff);


}

}

Upvotes: 0

Views: 1257

Answers (3)

Tagir Valeev
Tagir Valeev

Reputation: 100249

Yes, Streams are slower for such simple operations. But your numbers are completely unrelated. If you think that 15 milliseconds is satisfactory time for your task, then there are good news: after warm-up stream code can solve this problem in like 0.1-0.2 milliseconds, which is 70-150 times faster.

Here's quick-and-dirty benchmark:

import java.util.concurrent.TimeUnit;
import java.util.*;
import java.util.stream.*;

import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.annotations.*;

@Warmup(iterations = 5, time = 1000, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 1000, timeUnit = TimeUnit.MILLISECONDS)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Fork(3)
@State(Scope.Benchmark)
public class StreamTest {
    // Stream API is very nice to get random data for tests!
    List<Integer> a = new Random().ints(40000, 0, 40000).boxed()
                                  .collect(Collectors.toList());

    @Benchmark
    public Integer streamList() {
        return a.stream().max(Integer::compare).orElse(Integer.MIN_VALUE);
    }

    @Benchmark
    public Integer simpleList() {
        int e = a.size();
        Integer m = Integer.MIN_VALUE;
        for(int i=0; i < e; i++) 
            if(a.get(i) > m) m = a.get(i);
        return m;
    }
}

The results are:

Benchmark               Mode  Cnt    Score    Error  Units
StreamTest.simpleList   avgt   30   38.241 ±  0.434  us/op
StreamTest.streamList   avgt   30  215.425 ± 32.871  us/op

Here's microseconds. So the Stream version is actually much faster than your test. Nevertheless the simple version is even more faster. So if you were fine with 15 ms, you can use any of these two versions you like: both will perform much faster.

If you want to get the best possible performance no matter what, you should get rid of boxed Integer objects and work with primitive array:

int[] b = new Random().ints(40000, 0, 40000).toArray();

@Benchmark
public int streamArray() {
    return Arrays.stream(b).max().orElse(Integer.MIN_VALUE);
}

@Benchmark
public int simpleArray() {
    int e = b.length;
    int m = Integer.MIN_VALUE;
    for(int i=0; i < e; i++) 
        if(b[i] > m) m = b[i];
    return m;
}

Both versions are faster now:

Benchmark               Mode  Cnt    Score    Error  Units
StreamTest.simpleArray  avgt   30   10.132 ±  0.193  us/op
StreamTest.streamArray  avgt   30  167.435 ±  1.155  us/op

Actually the stream version result may vary greatly as it involves many intermediate methods which are JIT-compiled in different time, so the speed may change in any direction after some iterations.

By the way your original problem can be solved by good old Collections.max method without Stream API like this:

Integer max = Collections.max(a);

In general you should avoid testing the artificial code which does not solve real problems. With artificial code you will get the artificial results which generally say nothing about the API performance in real conditions.

Upvotes: 5

Sharon Ben Asher
Sharon Ben Asher

Reputation: 14348

The immediate difference that I see is that the stream way uses Integer::compare which might require more autoboxing etc. vs. an operator in the loop. perhaps you can call Integer::compare in the loop to see if this is the reason?

EDIT: following the advice from Nicholas Robinson, I wrote a new version of the test. It uses 400K sized list (the original one yielded zero diff results), it uses Integer.compare in both cases and runs only one of them in each invocation (I alternate between the two methods):

static List<Integer> a = new ArrayList<Integer>();

public static void main(String[] args)
{

    Random randomGenerator = new Random();
    for (int i = 0; i < 400000; i++) {
        a.add(randomGenerator.nextInt(400000));
    }

    long start = System.currentTimeMillis();
    //Integer max = checkLoop();
    Integer max = checkStream();
    long diff = System.currentTimeMillis() - start;
    System.out.println("max " + max + " diff " + diff);
}

static Integer checkStream()
{
    Optional<Integer> max = a.stream().max(Integer::compare);
    return max.get();
}

static Integer checkLoop()
{
    int e = a.size();
    Integer max = Integer.MIN_VALUE;
    for (int i = 0; i < e; i++) {
        if (Integer.compare(a.get(i), max) > 0) max = a.get(i); 
    }
    return max;
}

The results for loop: max 399999 diff 10
The results for stream: max 399999 diff 40 (and sometimes I got 50)

Upvotes: 1

Nicholas Robinson
Nicholas Robinson

Reputation: 1419

In Java 8 they have been putting a lot of effort into making use of concurrent processes with the new lambdas. You will find the stream to be so much faster because the list is being processed concurrently in the most efficient way possible where as the usual way is running through the list sequentially.

Because the lambda are static this makes threading easier, however when you are accessing something line your hard drive (reading in a file line by line) you will probably find the stream wont be as efficient because the hard drive can only access info.

[UPDATE] The reason your stream took so much longer than the normal way is because you run in first. The JRE is constantly trying to optimize the performance so there will be a cache set up with the usual way. If you run the usual way before the stream way you should get opposing results. I would recommend running the tests in different mains for the best results.

Upvotes: 0

Related Questions