Elsinor

Reputation: 200

Do Go testing.B benchmarks prevent unwanted optimizations?

I've recently started learning Go and I'm trying to implement a map that can be used concurrently by multiple goroutines. I want to be able to compare my implementation to a simple sync.Mutex-protected map, or to something like this: https://github.com/streamrail/concurrent-map/blob/master/concurrent_map.go
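For reference, the sync.Mutex-protected baseline I have in mind looks roughly like this (a minimal sketch; the type and method names are mine, not from any library):

```go
package main

import (
	"fmt"
	"sync"
)

// MutexMap wraps a plain map with a sync.RWMutex so it can be
// used concurrently. Readers share an RLock; writers take the
// exclusive Lock.
type MutexMap struct {
	mu sync.RWMutex
	m  map[string]int
}

func NewMutexMap() *MutexMap {
	return &MutexMap{m: make(map[string]int)}
}

func (c *MutexMap) Set(k string, v int) {
	c.mu.Lock()
	c.m[k] = v
	c.mu.Unlock()
}

func (c *MutexMap) Get(k string) (int, bool) {
	c.mu.RLock()
	v, ok := c.m[k]
	c.mu.RUnlock()
	return v, ok
}

func main() {
	m := NewMutexMap()
	m.Set("a", 1)
	v, ok := m.Get("a")
	fmt.Println(v, ok)
}
```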

From using Google Caliper, I assume that a naive benchmarking approach would let many unwanted optimizations distort the actual results. Do benchmarks written with testing.B employ some of the techniques to avoid that (after all, both Go and Caliper are Google projects)? If so, are they documented? If not, what's the best way to microbenchmark in Go?

Upvotes: 8

Views: 3099

Answers (3)

John S Perayil

Reputation: 6345

Converting my comment to an answer.

To be completely accurate, any benchmark should be careful to avoid compiler optimisations eliminating the function under test and artificially lowering the run time of the benchmark.

var result int

func BenchmarkFibComplete(b *testing.B) {
        var r int
        for n := 0; n < b.N; n++ {
                // always record the result of Fib to prevent
                // the compiler eliminating the function call.
                r = Fib(10)
        }
        // always store the result to a package level variable
        // so the compiler cannot eliminate the Benchmark itself.
        result = r
}

Source
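The original post doesn't show `Fib` itself, but in this example it's conventionally the naive recursive Fibonacci (an assumption on my part), which makes the benchmark above compile:

```go
package main

import "fmt"

// Fib is the naive recursive Fibonacci usually used in this
// benchmark example. Deliberately unoptimized, so the compiler
// has real work it could try to eliminate.
func Fib(n int) int {
	if n < 2 {
		return n
	}
	return Fib(n-1) + Fib(n-2)
}

func main() {
	fmt.Println(Fib(10)) // 55
}
```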

The following page can also be useful.

Compiler And Runtime Optimizations

Another interesting read is

One other interesting flag is -N, which will disable the optimisation pass in the compiler.

Source1 Source2

I'm not 100% sure, but the following should disable optimisations. Someone with more experience needs to confirm it.

go test -gcflags=-N -bench=.

Upvotes: 4

Rob Napier

Reputation: 299355

@David Budworth gives a lot of good info, and I agree regarding Go vs Java, but there still are many things you have to consider in microbenchmarking. Most of them boil down to "how closely does this match your use case?" For example, different concurrency patterns perform very differently under contention. Do you expect multiple simultaneous writers to be common? Single writer, many readers? Many readers, rare writing? Single-access? Different producers/consumers accessing different parts of the map? A scheme that performs beautifully in your benchmark may be rubbish for other use cases.
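To make the contention point concrete, here's a sketch (the map and workload are illustrative, not from the question) of exercising a shared map under many concurrent readers with `b.RunParallel`; swapping the `RLock` for a full `Lock`, or mixing in writers, can change the numbers dramatically:

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// Runs a parallel read-heavy benchmark against a mutex-protected
// map via testing.Benchmark, which lets us invoke a *testing.B
// benchmark from an ordinary program.
func main() {
	var mu sync.RWMutex
	m := make(map[int]int)
	for i := 0; i < 1024; i++ {
		m[i] = i
	}

	res := testing.Benchmark(func(b *testing.B) {
		// RunParallel spreads b.N iterations across GOMAXPROCS
		// goroutines, simulating many simultaneous readers.
		b.RunParallel(func(pb *testing.PB) {
			i := 0
			for pb.Next() {
				mu.RLock()
				_ = m[i%1024]
				mu.RUnlock()
				i++
			}
		})
	})
	fmt.Println(res.N > 0)
}
```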

Similarly you may discover that your scheme is or isn't very dependent on locality of reference. Some approaches perform very differently if the same values are being read over and over again (because they stay in the on-CPU caches). This is very common in microbenchmarks, but may not be very indicative of your intended use case.
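A quick way to see the locality effect (again a sketch, with arbitrary sizes of my choosing) is to benchmark the same lookup with one hot key versus keys spread across the whole map; the absolute numbers will vary by machine, the point is the comparison:

```go
package main

import (
	"fmt"
	"testing"
)

func main() {
	const size = 1 << 16
	m := make(map[int]int, size)
	for i := 0; i < size; i++ {
		m[i] = i
	}

	// Same key every iteration: the relevant bucket stays in
	// the CPU cache, which flatters the benchmark.
	hot := testing.Benchmark(func(b *testing.B) {
		var r int
		for n := 0; n < b.N; n++ {
			r = m[42]
		}
		_ = r // keep the result live
	})

	// Keys scattered over the map: touches many buckets, a
	// pattern closer to some real workloads.
	spread := testing.Benchmark(func(b *testing.B) {
		var r int
		for n := 0; n < b.N; n++ {
			r = m[(n*2654435761)%size]
		}
		_ = r
	})

	fmt.Println(hot.N > 0, spread.N > 0)
}
```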

This isn't to say microbenchmarks are useless, only that they are very often almost useless :D … at least for arriving at general conclusions. If you're building this for a particular project, just make sure that you're testing against realistic data and patterns that match your use case (and ideally just turn this into a real benchmark for your program, rather than a "microbenchmark" of the data structure). If you're building this for general use, you'll need to make sure you're benchmarking against a wide range of use cases before coming to too many conclusions on whether it is substantially better.

And if it's just educational, awesome. Learning why a particular scheme works better or worse in various situations is great experience. Just don't push your findings past your evidence.

Upvotes: 2

David Budworth

Reputation: 11626

In Java, microbenchmarks are harder to do because of how the HotSpot compiler works. If you simply run the same code over and over, you will often find it gets faster, which throws off your averages. To compensate, Caliper has to do warmup runs and other tricks to try to get a stable benchmark.

In Go, things are statically compiled. There is no HotSpot-like runtime system, so Go doesn't really have to do any tricks to get a good timing.

The testing.B functionality should have no impact on your code's performance, so you shouldn't have to do anything special.

Upvotes: 2
