Mantis
Mantis

Reputation: 124

Inconsistent JMH Throughput Values When Using paralell()

I'm new to JHM, to test it out I wrote the following simple method

@Benchmark
@OutputTimeUnit(TimeUnit.SECONDS)
public long sumPar(){
    return 
            LongStream.rangeClosed(1, LIMIT)
                      .parallel()
                      .sum();
}

The results of the test varied greatly

# JMH 1.10.3 (released 8 days ago)
# VM version: JDK 1.8.0_45, VM 25.45-b02
# VM invoker: /usr/lib/jvm/java-8-oracle/jre/bin/java
# VM options: <none>
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: testStreams.Bench.sumPar

# Run progress: 22.21% complete, ETA 00:02:33
# Fork: 1 of 1
# Warmup Iteration   1: 53.495 ops/s
# Warmup Iteration   2: 57.971 ops/s
# Warmup Iteration   3: 57.622 ops/s
# Warmup Iteration   4: 58.113 ops/s
# Warmup Iteration   5: 57.861 ops/s
Iteration   1: 50.414 ops/s
Iteration   2: 9.207 ops/s
Iteration   3: 9.176 ops/s
Iteration   4: 9.212 ops/s
Iteration   5: 9.175 ops/s

The same behaviour was observed after multiple runs. After reducing the warm up iterations I still saw a drop in ops/s after 5/6 iterations. Other Benchmarks in the test suite which used parallel operations performed consistently.

I'm new to JMH, I have a few questions.

Some specs

intel i7-4810MQ @ 2.80GHz
16GB RAM
Ubuntu 14.04 running on virtual box on windows host
virtual machine is alloacted 3 CPU cores
JDK 1.8.0_45
JMH 1.10.3

EDIT

Thanks for all the feedback, much appreciated.

I re-ran the benchmark this morning having left my laptop off all night. The inconsistent behaviour has completely disappeared. I upped the iterations to 1000 and still the ops/s remained consistent.

Checking CPU temp it was stable at 84 degrees.

I have not been able to reproduce this issue, next time I think my CPU may be overheating I'd like to re-run this benchmark and monitor CPU temp to see if this behaviour re-occurs.

Using -XX:+UnlockDiagnosticVMOptions -XX:+PrintCompilation -XX:+PrintInlining when running did show similar compilation patterns to the method in Erratic performance of Arrays.stream().map().sum(), however I do not think that was the root cause of this issue.


EDIT 2

Was able to reproduce and solved by adding -XX:MaxInlineLevel=12 flag.

DevBox:~/test$ java -jar target/benchmarks.jar testStreams.Bench -i 5 -wi 5 -f 1
# JMH 1.10.3 (released 8 days ago)
# VM version: JDK 1.8.0_45, VM 25.45-b02
# VM invoker: /usr/lib/jvm/java-8-oracle/jre/bin/java
# VM options: <none>
# Warmup: 5 iterations, 1 s each
# Measurement: 5 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: testStreams.Bench.sumPar

# Run progress: 0.00% complete, ETA 00:00:30
# Fork: 1 of 1
# Warmup Iteration   1: 53.672 ops/s
# Warmup Iteration   2: 57.720 ops/s
# Warmup Iteration   3: 58.320 ops/s
# Warmup Iteration   4: 58.174 ops/s
# Warmup Iteration   5: 58.680 ops/s
Iteration   1: 49.810 ops/s
Iteration   2: 9.109 ops/s
Iteration   3: 9.427 ops/s
Iteration   4: 9.437 ops/s
Iteration   5: 9.436 ops/s

DevBox:~/test$ java -XX:MaxInlineLevel=12 -jar target/benchmarks.jar testStreams.Bench -i 1000 -wi 5 -f 1
# JMH 1.10.3 (released 8 days ago)
# VM version: JDK 1.8.0_45, VM 25.45-b02
# VM invoker: /usr/lib/jvm/java-8-oracle/jre/bin/java
# VM options: -XX:MaxInlineLevel=12
# Warmup: 5 iterations, 1 s each
# Measurement: 1000 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Benchmark: testStreams.Bench.sumPar

# Run progress: 0.00% complete, ETA 00:50:16
# Fork: 1 of 1
# Warmup Iteration   1: 53.888 ops/s
# Warmup Iteration   2: 58.328 ops/s
# Warmup Iteration   3: 58.468 ops/s
# Warmup Iteration   4: 58.455 ops/s
# Warmup Iteration   5: 57.937 ops/s
Iteration   1: 58.717 ops/s
Iteration   2: 59.494 ops/s
Iteration   3: 60.013 ops/s
Iteration   4: 59.506 ops/s
Iteration   5: 51.543 ops/s

Still haven't found root cause as to why I couldn't reproduce the issue when -XX:MaxInlineLevel=12 was omitted before. As far as I can see I was using the same settings. I may try run benchmark the again after my laptop has been left idle for a while, but for now am happy I gained some insight into JIT inlining.

Upvotes: 2

Views: 1021

Answers (2)

Marko Topolnik
Marko Topolnik

Reputation: 200296

I have all but confirmed that you are experiencing exactly the same behavior as described in this post: Erratic performance of Arrays.stream().map().sum()

If you run your test long enough (1000 iterations), you'll see that at some point performance is restored. The reason is an awkward inlining decision by the JIT compiler, which tears apart the hot loop code. Refer to the linked post for excellent coverage by Paul Sandoz.

Upvotes: 3

the8472
the8472

Reputation: 43150

intel i7-4810MQ

That's a mobile CPU as far as I can tell. Check if it's exhausting its thermal design envelope and throttling down the CPU clock to avoid overheating.

Upvotes: 1

Related Questions