I'm trying to prove algorithm complexity using benchmark data. The algorithm to test is binary search (stated complexity is O(log n)) and I want to use the JMH library for benchmarking.
Here is a test example:
public class BinarySearchTest {

    private static SearchAlgorithm binaryIterative = new BinarySearchIterative();
    private static SearchAlgorithm binaryRecursive = new BinarySearchRecursive();

    @Test
    public void runBenchmarks() throws Exception {
        Options options = new OptionsBuilder()
                .include(this.getClass().getName() + ".*")
                .mode(Mode.Throughput)
                .forks(1)
                .threads(1)
                .warmupIterations(0)
                .measurementIterations(1)
                .shouldFailOnError(true)
                .shouldDoGC(true)
                .build();
        new Runner(options).run();
    }

    @Benchmark
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public void binarySearchIterativeBenchmark(ExecutionPlan plan) {
        //given
        int size = randomPositiveIntLessThan(plan.arraySize);
        int[] array = generateUninterrupted(0, size);
        int target = randomPositiveIntLessThan(size);
        //when
        var result = binaryIterative.find(array, 0, array.length, target);
        //then
        assertTrue(result != -1);
    }
}
This is the class with the algorithm implementation:
public class BinarySearchIterative implements SearchAlgorithm {

    @Override
    public int find(int[] array, int start, int end, int target) {
        if (end > array.length) {
            return -1;
        }
        int left = start;
        int right = end;
        while (left <= right) {
            int median = left + (right - left) / 2;
            if (array[median] == target) {
                return median;
            }
            if (array[median] > target) {
                right = median - 1;
            }
            if (array[median] < target) {
                left = median + 1;
            }
        }
        return -1;
    }
}
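As a quick sanity check outside of JMH, the same iterative search can be instrumented with a step counter to confirm it really does at most ~log2(n) loop iterations (a standalone sketch; the class name and counter are mine, not part of the question's SearchAlgorithm interface):

```java
public class StepCountingSearch {

    static int steps; // loop iterations of the last call

    // same iterative binary search as above, plus a counter
    static int find(int[] array, int target) {
        int left = 0, right = array.length - 1;
        steps = 0;
        while (left <= right) {
            steps++;
            int median = left + (right - left) / 2;
            if (array[median] == target) return median;
            if (array[median] > target) right = median - 1;
            else left = median + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        for (int n : new int[]{1_000, 100_000, 1_000_000}) {
            int[] array = new int[n];
            for (int i = 0; i < n; i++) array[i] = i;
            find(array, n - 1); // search the last element
            int bound = (int) (Math.log(n) / Math.log(2)) + 1; // floor(log2 n) + 1
            System.out.println("n=" + n + "  steps=" + steps + "  bound=" + bound);
        }
    }
}
```

The step count stays within floor(log2 n) + 1 for every n, which is exactly the growth the benchmark is supposed to expose as time.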
I use a class annotated with @State to get the size for the arrays:
@State(Scope.Benchmark)
public class ExecutionPlan {

    @Param({"100000", "200000", "300000", "400000", "500000",
            "1000000", "2000000", "3000000", "4000000", "5000000",
            "10000000", "20000000", "30000000", "40000000", "50000000"})
    public int arraySize;
}
So I get the following results:
BinarySearchTest.binarySearchIterativeBenchmark    100000  thrpt  31.602 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark    200000  thrpt  14.520 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark    300000  thrpt   9.004 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark    400000  thrpt   6.896 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark    500000  thrpt   5.333 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark   1000000  thrpt   2.304 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark   2000000  thrpt   0.790 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark   3000000  thrpt   0.451 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark   4000000  thrpt   0.330 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark   5000000  thrpt   0.232 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark  10000000  thrpt   0.135 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark  20000000  thrpt   0.061 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark  30000000  thrpt   0.039 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark  40000000  thrpt   0.033 ops/ms
BinarySearchTest.binarySearchIterativeBenchmark  50000000  thrpt   0.025 ops/ms
But if I plot the score/arraySize graph I get not log(n) but rather a 1/x curve. If I use Mode.AverageTime, the graph is rather x^2.
Here is my graph for the data provided above, y [ms/ops], x [arraySize]:
How can I get operation time units from JMH, or how should I tune my test?
Upvotes: 2
I think I've found the reasons for such behaviour. Here are my fixes:

- switched to Mode.AverageTime, so now the benchmark outputs the average time per operation, with @OutputTimeUnit(TimeUnit.NANOSECONDS);
- moved the array generation into the ExecutionPlan class and changed the generation strategy: I now generate random integer values instead of a continuous array of integers (thanks to @Spektre for this);
- set the @Setup level to Level.Trial, since per the docs the usage of Level.Invocation comes with some reasonable warnings.

Here is the obtained data for the iterative binary search, and the plotted graph with a trend line:

Some points have a big error, but the trend now tends to O(log n). I think the benchmark can be tuned for more precision using more iterations, warmups and forks.
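The effect of these fixes can be approximated even without the JMH harness: do the data generation once per trial (the Level.Trial idea) and average many searches in nanoseconds per operation (the Mode.AverageTime plus NANOSECONDS idea). A rough standalone sketch of that structure, not the actual JMH code:

```java
import java.util.Arrays;
import java.util.Random;

public class AverageTimeSketch {

    // same iterative binary search as in the question
    static int find(int[] array, int target) {
        int left = 0, right = array.length - 1;
        while (left <= right) {
            int median = left + (right - left) / 2;
            if (array[median] == target) return median;
            if (array[median] > target) right = median - 1;
            else left = median + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        Random rnd = new Random(1);
        int n = 1_000_000, reps = 200_000;
        // "trial"-level setup: random values, sorted once, OUTSIDE the timed loop
        int[] array = rnd.ints(n).toArray();
        Arrays.sort(array);
        int[] targets = new int[reps];
        for (int i = 0; i < reps; i++) targets[i] = array[rnd.nextInt(n)];
        long hits = 0;
        long t0 = System.nanoTime();
        for (int t : targets) if (find(array, t) != -1) hits++;
        long t1 = System.nanoTime();
        System.out.printf("n=%d  avg=%.1f ns/op  hits=%d%n",
                n, (t1 - t0) / (double) reps, hits);
    }
}
```

Because the setup cost is paid once and the search is averaged over many repetitions, the per-operation time now reflects only the search itself.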
Upvotes: 1
You are plotting and comparing the wrong stuff.
You benchmarked in Mode.Throughput, so you got operations per 1 ms (ops_ms), which is more or less the number of operations m divided by the measured time t in [ms]. For a binary search of size n that is:

m = ~log2(n)

To obtain the complexity and/or a correct plot you need to plot the measured time t versus the size n; however, you are plotting ops_ms versus n ...
So first we need to obtain the measured time t (top is the time of a single operation):
t = m*top
m = log2(n)
ops_ms = 1/top
--------------
top=1/ops_ms
t = log2(n)*top
--------------
t = log2(n)/ops_ms
so you need to plot t on the y axis and n on the x axis. However, as you can see, this way of measuring is worthless, as you need to know what you are measuring in order to get m, and even that is just an approximation... What is much better/more accurate is using the measured time directly, as your ops/ms is messing it all up.
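Applying this conversion to the numbers posted in the question (a small Java sketch; the throughput values are copied from the question's output, and only a few sizes are shown):

```java
public class ThroughputToTime {
    public static void main(String[] args) {
        // n and the measured throughput [ops/ms], copied from the question
        long[]   n     = {100_000, 500_000, 1_000_000, 10_000_000, 50_000_000};
        double[] opsMs = { 31.602,   5.333,     2.304,      0.135,      0.025};
        for (int i = 0; i < n.length; i++) {
            double top = 1.0 / opsMs[i];                     // top = 1/ops_ms [ms]
            double t = (Math.log(n[i]) / Math.log(2)) * top; // t = log2(n)*top
            System.out.printf("n=%8d  t=%8.3f ms%n", n[i], t);
        }
    }
}
```

Plotting this t against n (instead of ops_ms against n) is the input to the complexity fit that follows; note t still grows far faster than log2(n), which is exactly the discrepancy discussed below.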
When I use my complexity measurement on data like this:
const double l2=3.3219280948873623478703194294894;
double binsearch[]= // n[-],t[ms]
{
100000,l2*log( 100000.0)/31.602,
200000,l2*log( 200000.0)/14.520,
300000,l2*log( 300000.0)/ 9.004,
400000,l2*log( 400000.0)/ 6.896,
500000,l2*log( 500000.0)/ 5.333,
1000000,l2*log( 1000000.0)/ 2.304,
2000000,l2*log( 2000000.0)/ 0.790,
3000000,l2*log( 3000000.0)/ 0.451,
4000000,l2*log( 4000000.0)/ 0.330,
5000000,l2*log( 5000000.0)/ 0.232,
10000000,l2*log(10000000.0)/ 0.135,
20000000,l2*log(20000000.0)/ 0.061,
30000000,l2*log(30000000.0)/ 0.039,
40000000,l2*log(40000000.0)/ 0.033,
50000000,l2*log(50000000.0)/ 0.025,
0,0.000
};
it leads to this result:
binsearch O(n.log^4(n)) error = 0.398668
which is still too far from the expected log2(n), however much closer than the other options. That implies additional stuff is ballasting your ops/ms values, which I expected... You know, you have the JRE architecture and also the host architecture messing up the measurements with stuff like CACHE, prefetching, pipelining, etc., and on top of all that your JMH might do stuff too (like averaging or "enhancing" the ops/ms value for some purpose) ...
In case ops_ms is actually binsearch/ms, as one of the comments suggests, then the time is computed by 1/ops_ms instead, which might be true as the result is slightly closer to O(log(n)) but still too far off:
// time             O(n)              uncertainty
log2(n)/ops_ms O(n.log^4(n)) error = 0.398668 // m ops / ms
(n)/ops_ms O(n^2.log^3(n)) error = 0.398668 // n ops / ms
(1)/ops_ms O(n.log^3(n)) error = 0.398668 // binsearch / ms
So my advice is to find a way to measure time directly instead of using ops/ms
...
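In Java, the "measure time directly" advice can be sketched without any benchmark harness via System.nanoTime (a crude single-shot sketch: no warmup and no averaging, so absolute numbers will be noisy; the ascending PRNG data generation mirrors the C++ code below and stays outside the timed section):

```java
import java.util.Random;

public class DirectTiming {

    // same iterative binary search as in the question
    static int find(int[] array, int target) {
        int left = 0, right = array.length - 1;
        while (left <= right) {
            int median = left + (right - left) / 2;
            if (array[median] == target) return median;
            if (array[median] > target) right = median - 1;
            else left = median + 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int n : new int[]{100_000, 1_000_000, 10_000_000}) {
            // ascending PRNG array, built OUTSIDE the timed section
            int[] array = new int[n];
            int s = 0;
            for (int i = 0; i < n; i++) { s += 1 + rnd.nextInt(10); array[i] = s; }
            int target = array[rnd.nextInt(n)]; // target guaranteed to be present
            long t0 = System.nanoTime();
            int index = find(array, target);
            long t1 = System.nanoTime();
            System.out.println("n=" + n + "  index=" + index + "  t=" + (t1 - t0) + " ns");
        }
    }
}
```

Only the find call sits between the two nanoTime readings, so the generation cost cannot leak into the measurement.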
[edit1] my C++ implementation I tested on
int find(int *array,int size,int start,int end,int target)
{
if (end >= size) return -1;
int left = start;
int right = end;
while (left <= right)
{
int median = left + (right - left) / 2;
if (array[median] == target) return median;
if (array[median] > target) right = median - 1;
if (array[median] < target) left = median + 1;
}
return -1;
}
usage:
const int n=50000000;
double binsearch[]= // n[-],t[ms]
{
100000,1.0,
200000,1.0,
300000,1.0,
400000,1.0,
500000,1.0,
1000000,1.0,
2000000,1.0,
3000000,1.0,
4000000,1.0,
5000000,1.0,
10000000,1.0,
20000000,1.0,
30000000,1.0,
40000000,1.0,
50000000,1.0,
0,0.000
};
int *dat=new int[n],i,s;
Randomize();
for (s=0,i=0;i<n;i++)
{
s+=1+Random(10);
dat[i]=s;
}
for (i=0;binsearch[i];)
{
s=binsearch[i]; i++;
tbeg(); // star measuring of time
find(dat,s,0,s-1,dat[Random(s)]);
tend(); // end measuring of time
binsearch[i]=performance_tms; i++; // store measured time
}
delete[] dat;
This will generate a PRNG ascending int array and test your find on it. My bet is that your data is not random at all, as I described in my comment. When I apply this, the result is as expected:
binsearch O(log(n)) error = 0.528393
So either your array and/or target is not chosen correctly, or your measurement of time even includes its generation, which messes things up.
If I see it right, your array generation is either O(n^2) or O(n.log(n)), opposing my O(n), so if it were included in the measurements it would dominate the O(log(n))... The result:

(1)/ops_ms O(n.log^3(n)) error = 0.398668 // binsearch / ms

suggests that is the case and the generation used is around O(n.log(n)); the log power discrepancy is just due to the computation architecture used and the time-measurement imprecision...
Upvotes: 1