I was trying to observe the effects of CPU cache spatial locality by benchmarking sequential and random reads from an array with JMH. Interestingly, the results are almost the same.
So I wonder: is this the correct JMH approach?
Below is the test class I used:
import java.util.Random;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OperationsPerInvocation;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(MyBenchmark.N)
public class MyBenchmark {
    /*
     * # JMH version: 1.21
     * # VM version: JDK 1.8.0_91, Java HotSpot(TM) 64-Bit Server VM, 25.91-b15
     * # VM invoker: D:\jdk1.8.0_91\jre\bin\java.exe
     * # VM options: <none>
     * # Warmup: 5 iterations, 10 s each
     * # Measurement: 5 iterations, 10 s each
     * # Timeout: 10 min per iteration
     * # Threads: 1 thread, will synchronize iterations
     * # Benchmark mode: Average time, time/op
     *
     * Benchmark                 Mode  Cnt  Score   Error  Units
     * MyBenchmark.randomAccess  avgt   25  7,930 ± 0,378  ns/op
     * MyBenchmark.serialAccess  avgt   25  7,721 ± 0,081  ns/op
     */
    static final int N = 1_000;

    @State(Scope.Benchmark)
    public static class Common {
        int[] data = new int[N];
        int[] serialAccessOrder = new int[N];
        int[] randomAccessOrder = new int[N];

        public Common() {
            Random r = new Random(11234);
            for (int i = 0; i < N; i++) {
                data[i] = r.nextInt(N);
                serialAccessOrder[i] = i;
                // Reuse the random values in [0, N) as the random index sequence
                randomAccessOrder[i] = data[i];
            }
        }
    }

    // Reads data[] in index order 0..N-1 through an index array
    @Benchmark
    public void serialAccess(Blackhole bh, Common common) {
        for (int i = 0; i < N; i++) {
            bh.consume(common.data[common.serialAccessOrder[i]]);
        }
    }

    // Reads data[] at pseudo-random indices through an index array
    @Benchmark
    public void randomAccess(Blackhole bh, Common common) {
        for (int i = 0; i < N; i++) {
            bh.consume(common.data[common.randomAccessOrder[i]]);
        }
    }
}
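One detail of the setup above: randomAccessOrder is filled from r.nextInt(N), so indices can repeat and some elements of data may never be read. If every element should be visited exactly once in both benchmarks, the random order could instead be a shuffled permutation. A minimal sketch, assuming a Fisher-Yates shuffle is acceptable here (AccessOrders and its method names are hypothetical, not part of the benchmark above):

```java
import java.util.Random;

public class AccessOrders {
    // Identity order: 0, 1, ..., n-1
    public static int[] sequential(int n) {
        int[] order = new int[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        return order;
    }

    // Fisher-Yates shuffle of the identity order: every index appears exactly once
    public static int[] shuffled(int n, long seed) {
        int[] order = sequential(n);
        Random r = new Random(seed);
        for (int i = n - 1; i > 0; i--) {
            int j = r.nextInt(i + 1); // pick a position in [0, i]
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
        return order;
    }

    public static void main(String[] args) {
        int[] order = shuffled(1_000, 11234L);
        boolean[] seen = new boolean[order.length];
        for (int idx : order) {
            seen[idx] = true;
        }
        int distinct = 0;
        for (boolean b : seen) {
            if (b) {
                distinct++;
            }
        }
        // A permutation of 0..999 touches all 1000 indices
        System.out.println("distinct indices: " + distinct);
    }
}
```

In the benchmark, the result of shuffled(N, seed) would take the place of randomAccessOrder. Whether this changes the measured numbers noticeably is not something the results above can confirm.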
Update: It turns out N was too small (1_000 * 4 bytes/int ~= 4 KB), so most likely the entire array fit in the CPU cache. Increasing N to 1_000_000 yields more intuitive results:
Benchmark                 Mode  Cnt   Score   Error  Units
MyBenchmark.randomAccess  avgt   25  20,426 ± 0,678  ns/op
MyBenchmark.serialAccess  avgt   25   6,762 ± 0,252  ns/op
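The size arithmetic in the update can be written out. Each benchmark method reads two int arrays (data plus one index array), so the memory it touches is roughly twice the figure for a single array. A rough sketch, where WorkingSet is a hypothetical helper, the 32 KiB L1 data cache size is an assumption (a common value, not measured on the machine used here), and array object headers are ignored:

```java
public class WorkingSet {
    // Bytes read per benchmark method: data[] plus one index array, 4 bytes per int
    public static long bytesTouched(int n) {
        return 2L * n * Integer.BYTES;
    }

    // True if the touched bytes fit within a cache of the given size
    public static boolean fitsIn(int n, long cacheBytes) {
        return bytesTouched(n) <= cacheBytes;
    }

    public static void main(String[] args) {
        long assumedL1d = 32L * 1024; // assumption: 32 KiB L1 data cache
        for (int n : new int[] {1_000, 1_000_000}) {
            System.out.println("N=" + n
                    + " touches " + bytesTouched(n) + " bytes,"
                    + " fits in assumed L1d: " + fitsIn(n, assumedL1d));
        }
    }
}
```

With N = 1_000 this gives 8,000 bytes, well under the assumed cache size; with N = 1_000_000 it gives 8,000,000 bytes, which is far larger, consistent with the gap between random and serial access appearing only at the larger N.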