user997112

Reputation: 30615

Measuring time according to CPU clock?

I have read in a few low latency technical papers that they measured timings via the CPU, as it is more accurate.

Usually in Java I would use:

System.nanoTime()

and in C++ I believe I once used a performance counter method I found online that was accurate to nanoseconds. It declared a LARGE_INTEGER, passed it by reference to QueryPerformanceCounter(), and divided the result by the counter frequency to get the elapsed time.

Is there any Java equivalent code to measure time according to the CPU, or would one have to use some sort of PInvoke?
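For reference, a minimal sketch of the usual Java approach: System.nanoTime() returns a monotonic timestamp suitable for measuring elapsed intervals (unlike System.currentTimeMillis(), it is not tied to wall-clock time). Its actual granularity depends on the OS clock source rather than reading the CPU's TSC directly. The loop body here is just illustrative work to time.

```java
public class NanoTimeDemo {
    public static void main(String[] args) {
        long start = System.nanoTime();

        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += i; // the work being timed
        }

        long elapsed = System.nanoTime() - start;
        System.out.println("sum=" + sum + " elapsedNanos=" + elapsed);
    }
}
```

Note that nanosecond *resolution* of the returned value does not guarantee nanosecond *accuracy*; two successive calls may return values many nanoseconds apart.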

EDIT:

https://www.google.co.uk/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CCYQFjAA&url=http%3A%2F%2Fdisruptor.googlecode.com%2Ffiles%2FDisruptor-1.0.pdf&ei=ImmQT5WQMOaW0QWW2sTwAQ&usg=AFQjCNEeGmYXzJa8huMdRGN2p4n8YH-jfg

To time at this level of precision it is necessary to use time stamp counters from the CPU. We chose CPUs with an invariant TSC because older processors suffer from changing frequency due to power saving and sleep states.

I'm interested in answers for Windows and Linux, but would appreciate if people could explain if their answer is specific to one.

Upvotes: 2

Views: 1555

Answers (2)

Kevin Welker

Reputation: 7937

Micro-benchmarking has several inherent variables that might be overlooked

  • Java effect of garbage collection
  • Java effects of JIT optimizations which take some time to "warm up"
  • Java target VM
  • Java VM settings (-Xnnnn settings, as well as client vs server mode)
  • target OS differences
  • target CPU differences
  • quiescence: how busy is the CPU multi-tasking other things in the background
  • overhead of the benchmark code itself

A tool like the Caliper micro-benchmarking framework attempts to address some, but not all, of the above issues. I am not even certain of everything it attempts to do, but the main obvious things are: it warms up the JIT, runs the benchmark code a fixed number of times and averages over the iterations, and repeats that exercise several times until the difference between runs falls within some acceptable tolerance. It also captures and records the environment, so that future benchmarks compare apples to apples (instead of oranges). And it lets you easily repeat all of the above with different VM settings or program arguments and compare the results.

That said, it is still a tricky, peril-fraught endeavor: it is easy to misinterpret the results yourself, and even easier for somebody else to misinterpret them.

EDIT (Addition): Actually, the JIT can cut both ways. While you generally want the JIT warmed up, it can also optimize away the very things you want to include in the benchmark. So you have to write your benchmark to anticipate and prevent things like loop invariants from being hoisted or eliminated, by forcing each iteration to actually vary in the ways that are significant to what you are measuring.
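The warm-up and dead-code concerns above can be sketched in a hand-rolled benchmark. This is an illustrative example, not Caliper's API: it runs untimed warm-up iterations first so the JIT compiles the hot path, and writes every result into a static `sink` field so the JIT cannot prove the loop body is dead and eliminate it.

```java
public class WarmupBench {
    // Results are accumulated here so the JIT sees them as observably used
    // and cannot eliminate the benchmarked loop as dead code.
    private static long sink;

    static long work(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += i * 31L;
        }
        return acc;
    }

    public static void main(String[] args) {
        // Untimed warm-up: let the JIT compile and optimize work().
        for (int i = 0; i < 10_000; i++) {
            sink += work(1_000);
        }

        // Timed run over the now-warm code.
        long start = System.nanoTime();
        for (int i = 0; i < 10_000; i++) {
            sink += work(1_000);
        }
        long elapsed = System.nanoTime() - start;

        System.out.println("avg ns/op = " + (elapsed / 10_000.0)
                + " (sink=" + sink + ")");
    }
}
```

Frameworks formalize this idea (JMH, for instance, provides a Blackhole for consuming results), but the principle is the same: make the result escape, or the optimizer may delete your benchmark.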

Upvotes: 1

Peter Lawrey

Reputation: 533500

System.nanoTime() can be a fast, nanosecond-resolution timer, depending on the OS. On some OSes a call takes as little as 20 ns.
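One way to see what your platform actually delivers is to call System.nanoTime() back to back and record the smallest positive difference observed; this gives a rough upper bound on the combined call overhead and clock granularity. A minimal sketch (the one-million iteration count is arbitrary):

```java
public class NanoTimeResolution {
    public static void main(String[] args) {
        long min = Long.MAX_VALUE;
        long prev = System.nanoTime();
        for (int i = 0; i < 1_000_000; i++) {
            long now = System.nanoTime();
            long delta = now - prev;
            // Ignore zero deltas (calls within one clock tick).
            if (delta > 0 && delta < min) {
                min = delta;
            }
            prev = now;
        }
        System.out.println("smallest observed tick: " + min + " ns");
    }
}
```

On a Linux box with a good clock source this is typically tens of nanoseconds; on OSes with a coarse timer it can be far larger.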

In this library I have used RDTSC, because RHEL 5.x is not one of the OSes where it is fast. :( https://github.com/peter-lawrey/Java-Thread-Affinity On a fast PC it takes less than 10 ns.

The problem with using the CPU counter is that it can differ between sockets. If your program only runs on one socket, this is not a problem.

Upvotes: 2
