Orangeberry

Reputation: 51

Google/benchmark inconsistent results

I'm new to Google Benchmark. The benchmark below, which retrieves the current time in several ways in C++, gives very different results when I run it locally than when I run it on Quick-Bench.com. In both cases I used GCC 8.2 with -O3.

Why do the results vary dramatically between running locally vs on quick-bench.com? Which is correct?

#include <benchmark/benchmark.h>
#include <ctime>      
#include <sys/time.h> 
#include <chrono>     


static void BM_ctime(benchmark::State& state) {
  unsigned long long count = 0;

  for (auto _ : state) {
    std::time_t sec = std::time(0);  

    benchmark::DoNotOptimize(count += sec);
  }
}

BENCHMARK(BM_ctime);


static void BM_sysTime(benchmark::State& state) {
  unsigned long long count = 0;

  for (auto _ : state) {
    unsigned long sec = time(NULL);

    benchmark::DoNotOptimize(count += sec);
  }
}

BENCHMARK(BM_sysTime);


static void BM_chronoMilliseconds(benchmark::State& state) {
  unsigned long long count = 0;

  for (auto _ : state) {
    unsigned long long ms = std::chrono::duration_cast<std::chrono::milliseconds>(
      std::chrono::system_clock::now().time_since_epoch()
    ).count();

    benchmark::DoNotOptimize(count += ms);
  }
}

BENCHMARK(BM_chronoMilliseconds);

static void BM_chronoSececonds(benchmark::State& state) {
  unsigned long long count = 0;

  for (auto _ : state) {
    unsigned long long sec = std::chrono::duration_cast<std::chrono::seconds>(
      std::chrono::system_clock::now().time_since_epoch()
    ).count();

    benchmark::DoNotOptimize(count += sec);
  }
}

BENCHMARK(BM_chronoSececonds);

Locally the following results are produced:

-------------------------------------------------------------
Benchmark                      Time           CPU Iterations
-------------------------------------------------------------
BM_ctime                     183 ns        175 ns    4082013
BM_sysTime                   197 ns        179 ns    4004829
BM_chronoMilliseconds         37 ns         36 ns   19092506
BM_chronoSececonds            37 ns         36 ns   19057991

QuickBench results:

Upvotes: 1

Views: 1564

Answers (2)

talekeDskobeDa

Reputation: 382

I just ran your example on my machine and got the following result:

----------------------------------------------------------------
Benchmark                      Time             CPU   Iterations
----------------------------------------------------------------
BM_ctime                    3.26 ns         3.25 ns    215110555
BM_sysTime                  3.26 ns         3.25 ns    215154791
BM_chronoMilliseconds       2502 ns         2502 ns       279856
BM_chronoSececonds          2502 ns         2501 ns       279854

Assuming that a NOP instruction takes 1 clock cycle, which is 0.5 ns on my system, the ratio of CPU time to NOP time for the two chrono benchmarks (about 2500 ns each) is around 5000.
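The arithmetic behind that estimate can be written out as a small sketch. The helper name `approx_cycles` is made up for illustration, and the 0.5 ns cycle time is the assumption stated above, not a measured value.

```cpp
#include <cassert>

// Hypothetical helper: rough number of clock cycles that fit into `time_ns`,
// assuming a single cycle lasts `ns_per_cycle` nanoseconds.
double approx_cycles(double time_ns, double ns_per_cycle) {
  assert(ns_per_cycle > 0.0);  // a zero or negative cycle time makes no sense
  return time_ns / ns_per_cycle;
}
```

With the figures from the table, `approx_cycles(2502.0, 0.5)` gives 5004 for the chrono benchmarks, while `approx_cycles(3.25, 0.5)` gives only 6.5 for `BM_ctime` and `BM_sysTime`.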

However, I would not be too concerned about this, because that is not what benchmarking is meant for, at least for me. It doesn't make sense to compare values from my system with the values from Quick Bench. Rather, I use benchmark values to compare different implementations or algorithms on the same machine, which eliminates such doubts.
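To illustrate why only same-machine comparisons are meaningful, here is a minimal sketch that computes the relative cost of the chrono variant against the ctime variant within one run. The struct and function names are hypothetical; the numbers in the usage note are the CPU times from the two tables in this thread.

```cpp
#include <cassert>

// CPU time per iteration for two benchmarks taken from the SAME run.
struct SameRunTimes {
  double ctime_ns;   // e.g. BM_ctime
  double chrono_ns;  // e.g. BM_chronoSececonds
};

// How many times more expensive the chrono variant is than the ctime variant.
// A value below 1 means the chrono variant was the cheaper one in that run.
double chrono_over_ctime(const SameRunTimes& run) {
  assert(run.ctime_ns > 0.0);
  return run.chrono_ns / run.ctime_ns;
}
```

For the question's local run, `chrono_over_ctime({175.0, 36.0})` is roughly 0.2, whereas for the run in this answer, `chrono_over_ctime({3.25, 2501.0})` is roughly 770. Even the ordering of the two approaches differs between the machines, so a ratio is only informative for the environment it was measured in.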

Upvotes: 0

Karl

Reputation: 181

Benchmark results depend on the platform, architecture, and machine. You cannot even assume that results will be identical between runs on the same machine: factors such as temperature, performance-scaling settings, and wear and tear can all affect performance.
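One way to see how much results move between runs is to repeat the measurement and look at the spread. Google Benchmark can do this itself with the `--benchmark_repetitions=<n>` flag, which reports aggregates such as mean, median, and standard deviation. The sketch below shows the kind of summary involved; `RunStats` and `summarize` are hypothetical names, and the sample values in the usage note are invented purely for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Summary of repeated measurements of the same benchmark.
struct RunStats {
  double mean;    // average time per iteration
  double stddev;  // sample standard deviation (n - 1 in the denominator)
  double cv;      // coefficient of variation: stddev relative to the mean
};

// Hypothetical helper: summarize per-run timings (in ns) from repeated runs.
RunStats summarize(const std::vector<double>& samples_ns) {
  assert(samples_ns.size() >= 2);  // spread is undefined for a single run

  double sum = 0.0;
  for (double s : samples_ns) sum += s;
  const double mean = sum / static_cast<double>(samples_ns.size());
  assert(mean > 0.0);  // timings are expected to be positive

  double squared_deviation = 0.0;
  for (double s : samples_ns) squared_deviation += (s - mean) * (s - mean);
  const double stddev =
      std::sqrt(squared_deviation / static_cast<double>(samples_ns.size() - 1));

  return {mean, stddev, stddev / mean};
}
```

For made-up samples of 10, 12, and 14 ns, `summarize` returns a mean of 12, a standard deviation of 2, and a coefficient of variation of about 0.17. A large coefficient of variation suggests the environment is noisy and that individual numbers should not be over-interpreted.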

Upvotes: 1
