I'm trying to prove algorithm complexity using benchmark data. The algorithm to test is binary search (stated complexity is O(log n)) and I want to use the JMH library for benchmarking.
Here is a test example:
public class BinarySearchTest {

    private static SearchAlgorithm binaryIterative = new BinarySearchIterative();
    private static SearchAlgorithm binaryRecursive = new BinarySearchRecursive();

    @Test
    public void runBenchmarks() throws Exception {
        Options options = new OptionsBuilder()
                .include(this.getClass().getName() + ".*")
                .mode(Mode.Throughput)
                .forks(1)
                .threads(1)
                .warmupIterations(0)
                .measurementIterations(1)
                .shouldFailOnError(true)
                .shouldDoGC(true)
                .build();
        new Runner(options).run();
    }

    @Benchmark
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public void binarySearchIterativeBenchmark(ExecutionPlan plan) {
        //given
        int size = randomPositiveIntLessThan(plan.arraySize);
        int[] array = generateUninterrupted(0, size);
        int target = randomPositiveIntLessThan(size);
        //when
        var result = binaryIterative.find(array, 0, array.length, target);
        //then
        assertTrue(result != -1);
    }
}
This is the class with the algorithm implementation:
public class BinarySearchIterative implements SearchAlgorithm {

    @Override
    public int find(int[] array, int start, int end, int target) {
        if (end > array.length) {
            return -1;
        }
        int left = start;
        int right = end;
        while (left <= right) {
            int median = left + (right - left) / 2;
            if (array[median] == target) {
                return median;
            }
            if (array[median] > target) {
                right = median - 1;
            }
            if (array[median] < target) {
                left = median + 1;
            }
        }
        return -1;
    }
}
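As a quick sanity check outside of JMH, the same iterative search can be instrumented with a step counter to confirm it really does at most ~log2(n) loop iterations (a standalone sketch; the class name and counter are mine, not part of the question's SearchAlgorithm interface):

```java
public class StepCountingSearch {

    static int steps; // loop iterations of the last call

    // same iterative binary search as above, plus a counter
    static int find(int[] array, int target) {
        int left = 0, right = array.length - 1;
        steps = 0;
        while (left <= right) {
            steps++;
            int median = left + (right - left) / 2;
            if (array[median] == target) return median;
            if (array[median] > target) right = median - 1;
            else left = median + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        for (int n : new int[]{1_000, 100_000, 1_000_000}) {
            int[] array = new int[n];
            for (int i = 0; i < n; i++) array[i] = i;
            find(array, n - 1); // search the last element
            int bound = (int) (Math.log(n) / Math.log(2)) + 1; // floor(log2 n) + 1
            System.out.println("n=" + n + "  steps=" + steps + "  bound=" + bound);
        }
    }
}
```

The step count stays within floor(log2 n) + 1 for every n, which is exactly the growth the benchmark is supposed to expose as time.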
I use a class annotated with @State to get the size for the arrays:
@State(Scope.Benchmark)
public class ExecutionPlan {

    @Param({"100000", "200000", "300000", "400000", "500000",
            "1000000", "2000000", "3000000", "4000000", "5000000",
            "10000000", "20000000", "30000000", "40000000", "50000000"})
    public int arraySize;
}
So I get the following results:
BinarySearchTest.binarySearchIterativeBenchmark    100000  thrpt  31.602 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark    200000  thrpt  14.520 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark    300000  thrpt   9.004 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark    400000  thrpt   6.896 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark    500000  thrpt   5.333 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark   1000000  thrpt   2.304 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark   2000000  thrpt   0.790 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark   3000000  thrpt   0.451 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark   4000000  thrpt   0.330 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark   5000000  thrpt   0.232 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark  10000000  thrpt   0.135 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark  20000000  thrpt   0.061 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark  30000000  thrpt   0.039 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark  40000000  thrpt   0.033 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark  50000000  thrpt   0.025 ops/ms
But if I plot the score/arraySize graph I get not log(n) but rather a 1/x curve. If I use Mode.AverageTime, the graph is rather x^2.
Here is my graph for the data provided above, y [ms/ops], x [arraySize]:
How can I get operation time units from JMH, or how should I tune my test?
Upvotes: 2
I think I've found the reasons for such behaviour. Here are my fixes:

- switched to Mode.AverageTime, so now the benchmark outputs the average time per operation, with @OutputTimeUnit(TimeUnit.NANOSECONDS);
- moved the array generation into the ExecutionPlan class and changed the generation strategy: I now generate random integer values instead of a continuous array of integers (thanks to @Spektre for this);
- set the @Setup level to Level.Trial, since per the docs the usage of Level.Invocation comes with some reasonable warnings.

Here is the obtained data for the iterative binary search, and the plotted graph with a trend line:

Some points have a big error, but the trend now tends to O(log n). I think the benchmark can be tuned for more precision using more iterations, warmups and forks.
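The effect of these fixes can be approximated even without the JMH harness: do the data generation once per trial (the Level.Trial idea) and average many searches in nanoseconds per operation (the Mode.AverageTime plus NANOSECONDS idea). A rough standalone sketch of that structure, not the actual JMH code:

```java
import java.util.Arrays;
import java.util.Random;

public class AverageTimeSketch {

    // same iterative binary search as in the question
    static int find(int[] array, int target) {
        int left = 0, right = array.length - 1;
        while (left <= right) {
            int median = left + (right - left) / 2;
            if (array[median] == target) return median;
            if (array[median] > target) right = median - 1;
            else left = median + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        int n = 1_000_000, reps = 200_000;
        // "trial"-level setup: random values, sorted once, OUTSIDE the timed loop
        int[] array = rnd.ints(n).toArray();
        Arrays.sort(array);
        int[] targets = new int[reps];
        for (int i = 0; i < reps; i++) targets[i] = array[rnd.nextInt(n)];
        long hits = 0;
        long t0 = System.nanoTime();
        for (int t : targets) if (find(array, t) != -1) hits++;
        long t1 = System.nanoTime();
        System.out.printf("n=%d  avg=%.1f ns/op  hits=%d%n",
                n, (t1 - t0) / (double) reps, hits);
    }
}
```

Because the setup cost is paid once and the search is averaged over many repetitions, the per-operation time now reflects only the search itself.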
Upvotes: 1
You are plotting and comparing the wrong stuff.
You benchmarked in Mode.Throughput, so you got operations per 1 ms (ops_ms), which is more or less the number of operations m divided by the measured time t in [ms]. For a binary search of size n that is:

m = ~log2(n)

To obtain the complexity and/or a correct plot you need to plot the measured time t versus the size n; however, you are plotting ops_ms versus n ...
So first we need to obtain the measured time t (top is the time of a single operation):
t = m*top
m = log2(n)
ops_ms = 1/top
--------------
top=1/ops_ms
t = log2(n)*top
--------------
t = log2(n)/ops_ms
so you need to plot t on the y axis and n on the x axis. However, as you can see, this way of measuring is worthless, as you need to know what you are measuring in order to get m, and even that is just an approximation... What is much better/more accurate is using the measured time directly, as your ops/ms is messing it all up.
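Applying this conversion to the numbers posted in the question (a small Java sketch; the throughput values are copied from the question's output, and only a few sizes are shown):

```java
public class ThroughputToTime {
    public static void main(String[] args) {
        // n and the measured throughput [ops/ms], copied from the question
        long[]   n     = {100_000, 500_000, 1_000_000, 10_000_000, 50_000_000};
        double[] opsMs = { 31.602,   5.333,     2.304,      0.135,      0.025};
        for (int i = 0; i < n.length; i++) {
            double top = 1.0 / opsMs[i];                     // top = 1/ops_ms [ms]
            double t = (Math.log(n[i]) / Math.log(2)) * top; // t = log2(n)*top
            System.out.printf("n=%8d  t=%8.3f ms%n", n[i], t);
        }
    }
}
```

Plotting this t against n (instead of ops_ms against n) is the input to the complexity fit that follows; note t still grows far faster than log2(n), which is exactly the discrepancy discussed below.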
When I use my complexity measurement on data like this:
const double l2=3.3219280948873623478703194294894;
double binsearch[]= // n[-],t[ms]
{
100000,l2*log( 100000.0)/31.602,
200000,l2*log( 200000.0)/14.520,
300000,l2*log( 300000.0)/ 9.004,
400000,l2*log( 400000.0)/ 6.896,
500000,l2*log( 500000.0)/ 5.333,
1000000,l2*log( 1000000.0)/ 2.304,
2000000,l2*log( 2000000.0)/ 0.790,
3000000,l2*log( 3000000.0)/ 0.451,
4000000,l2*log( 4000000.0)/ 0.330,
5000000,l2*log( 5000000.0)/ 0.232,
10000000,l2*log(10000000.0)/ 0.135,
20000000,l2*log(20000000.0)/ 0.061,
30000000,l2*log(30000000.0)/ 0.039,
40000000,l2*log(40000000.0)/ 0.033,
50000000,l2*log(50000000.0)/ 0.025,
0,0.000
};
it leads to this result:
binsearch O(n.log^4(n)) error = 0.398668
which is still too far from the expected log2(n), however much closer than the other options. That implies additional stuff is ballasting your ops/ms values, which I expected... You know, you have the JRE architecture and also the host architecture messing up the measurements with stuff like CACHE, prefetching, pipelining, etc., and on top of all that your JMH might do stuff too (like averaging or "enhancing" the ops/ms value for some purpose) ...
In case ops_ms is actually binsearch/ms, as one of the comments suggests, then the time is computed by 1/ops_ms instead, which might be true as the result is slightly closer to O(log(n)) but still too far off:
// time             O(n)              uncertainty
log2(n)/ops_ms O(n.log^4(n)) error = 0.398668 // m ops / ms
(n)/ops_ms O(n^2.log^3(n)) error = 0.398668 // n ops / ms
(1)/ops_ms O(n.log^3(n)) error = 0.398668 // binsearch / ms
So my advice is to find a way to measure time directly instead of using ops/ms
...
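In Java, the "measure time directly" advice can be sketched without any benchmark harness via System.nanoTime (a crude single-shot sketch: no warmup and no averaging, so absolute numbers will be noisy; the ascending PRNG data generation mirrors the C++ code below and stays outside the timed section):

```java
import java.util.Random;

public class DirectTiming {

    // same iterative binary search as in the question
    static int find(int[] array, int target) {
        int left = 0, right = array.length - 1;
        while (left <= right) {
            int median = left + (right - left) / 2;
            if (array[median] == target) return median;
            if (array[median] > target) right = median - 1;
            else left = median + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int n : new int[]{100_000, 1_000_000, 10_000_000}) {
            // ascending PRNG array, built OUTSIDE the timed section
            int[] array = new int[n];
            int s = 0;
            for (int i = 0; i < n; i++) { s += 1 + rnd.nextInt(10); array[i] = s; }
            int target = array[rnd.nextInt(n)]; // target guaranteed to be present
            long t0 = System.nanoTime();
            int index = find(array, target);
            long t1 = System.nanoTime();
            System.out.println("n=" + n + "  index=" + index + "  t=" + (t1 - t0) + " ns");
        }
    }
}
```

Only the find call sits between the two nanoTime readings, so the generation cost cannot leak into the measurement.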
[edit1] my C++ implementation I tested on
int find(int *array,int size,int start,int end,int target)
{
if (end >= size) return -1;
int left = start;
int right = end;
while (left <= right)
{
int median = left + (right - left) / 2;
if (array[median] == target) return median;
if (array[median] > target) right = median - 1;
if (array[median] < target) left = median + 1;
}
return -1;
}
usage:
const int n=50000000;
double binsearch[]= // n[-],t[ms]
{
100000,1.0,
200000,1.0,
300000,1.0,
400000,1.0,
500000,1.0,
1000000,1.0,
2000000,1.0,
3000000,1.0,
4000000,1.0,
5000000,1.0,
10000000,1.0,
20000000,1.0,
30000000,1.0,
40000000,1.0,
50000000,1.0,
0,0.000
};
int *dat=new int[n],i,s;
Randomize();
for (s=0,i=0;i<n;i++)
{
s+=1+Random(10);
dat[i]=s;
}
for (i=0;binsearch[i];)
{
s=binsearch[i]; i++;
tbeg(); // star measuring of time
find(dat,s,0,s-1,dat[Random(s)]);
tend(); // end measuring of time
binsearch[i]=performance_tms; i++; // store measured time
}
delete[] dat;
This will generate a PRNG ascending int array and test your find on it. My bet is that your data is not random at all, as I described in my comment. When I apply this, the result is as expected:
binsearch O(log(n)) error = 0.528393
So either your array and/or target is not chosen correctly, or your measurement of time even includes its generation, which messes things up.
If I see it right, your array generation is either O(n^2) or O(n.log(n)), opposing my O(n), so if it were included in the measurements it would dominate the O(log(n))... The result:

(1)/ops_ms O(n.log^3(n)) error = 0.398668 // binsearch / ms

suggests that is the case and the generation used is around O(n.log(n)); the log power discrepancy is just due to the computation architecture used and the time-measurement imprecision...
Upvotes: 1