Reputation: 30615
I have read in a few low latency technical papers that they measured timings via the CPU, as it is more accurate.
Usually in Java I would use:
System.nanoTime()
and in C++ I believe I once used a performance-counter method that I found online and that was accurate to nanoseconds. As far as I remember, it declared a LARGE_INTEGER, passed it by reference to QueryPerformanceCounter(), and returned the resulting count divided by the counter frequency.
Is there any Java equivalent code to measure time according to the CPU, or would one have to use some sort of PInvoke?
EDIT:
To time at this level of precision it is necessary to use time stamp counters from the CPU. We chose CPUs with an invariant TSC because older processors suffer from changing frequency due to power saving and sleep states.
I'm interested in answers for Windows and Linux, but would appreciate if people could explain if their answer is specific to one.
Upvotes: 2
Views: 1555
Reputation: 7937
Micro-benchmarking has several inherent variables that might be overlooked.
A tool like the Caliper micro-benchmarking framework attempts to address some, but not all, of the above issues. I am not even certain of everything it attempts to do. But the main, obvious things it does are: warm up the JIT, run the benchmark code a fixed number of times and average over the iterations, and repeat that exercise several times until the difference between runs falls within an acceptable tolerance. It also captures and records the environment so that future benchmarks can compare apples to apples (instead of to oranges). And it lets you easily repeat all of the above with different VM settings or program arguments and compare the results.
That said, it is still a tricky, peril-fraught endeavor to avoid misinterpreting the results, or, more likely, to keep somebody else from misinterpreting them.
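To make the warm-up / average / repeat routine described above concrete, here is a rough hand-rolled sketch. It is not Caliper's API or implementation; the class name `RepeatedTiming`, the helper `lcgTask`, the five warm-up rounds, and the 5% tolerance are all assumptions picked for illustration.

```java
import java.util.function.LongSupplier;

final class RepeatedTiming {
    // Results are written here so the JIT cannot treat the measured work as unused.
    static volatile long sink;

    // Runs the task `iterations` times and returns the mean time per call in nanoseconds.
    static double averageNanos(LongSupplier task, int iterations) {
        long acc = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            acc += task.getAsLong();
        }
        long elapsed = System.nanoTime() - start;
        sink += acc;
        return (double) elapsed / iterations;
    }

    // Warms up, then repeats trials until two consecutive averages agree
    // within the relative `tolerance`, or until `maxTrials` is reached.
    static double measure(LongSupplier task, int iterations, double tolerance, int maxTrials) {
        for (int i = 0; i < 5; i++) {
            averageNanos(task, iterations); // warm-up rounds, results discarded
        }
        double previous = averageNanos(task, iterations);
        for (int trial = 1; trial < maxTrials; trial++) {
            double current = averageNanos(task, iterations);
            if (Math.abs(current - previous) <= tolerance * Math.max(current, previous)) {
                return current;
            }
            previous = current;
        }
        return previous; // never stabilised; report the last trial anyway
    }

    // Hypothetical workload: one step of a linear congruential generator per call.
    static LongSupplier lcgTask() {
        long[] state = {1L};
        return () -> {
            state[0] = state[0] * 6364136223846793005L + 1442695040888963407L;
            return state[0];
        };
    }

    public static void main(String[] args) {
        double nanos = measure(lcgTask(), 1_000_000, 0.05, 20);
        System.out.println("approx. ns per call: " + nanos);
    }
}
```

A real framework does considerably more (forked JVMs, environment capture, statistical treatment of outliers), so treat the number this prints as a rough indication only.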
EDIT (Addition): Actually, the JIT can cut both ways. While you generally want the JIT to be warmed up, it can also optimize away things that you want to include as part of the benchmark. So you have to write your benchmark in such a way as to anticipate and prevent things like loop invariants from being optimized away, by forcing each loop iteration to actually vary in the ways that are significant to what you are measuring.
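A small sketch of that point, with made-up names (`JitSafeLoop`, `timeInvariant`, `timeVarying`): the first loop uses a constant input and throws the result away, so the JIT is free to hoist or remove the call; the second varies the input on every iteration and publishes the accumulated result to a volatile field. Whether the first loop actually gets optimized away depends on the JVM and its settings, so this only illustrates the pattern.

```java
final class JitSafeLoop {
    // Written once per measurement so the computed value is observably used.
    static volatile long sink;

    // Risky: the argument is loop-invariant and the result is discarded,
    // so the JIT may hoist the call out of the loop or drop it entirely.
    static long timeInvariant(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            Math.sqrt(42.0);
        }
        return System.nanoTime() - start;
    }

    // Safer: the input depends on the loop counter and the sum is consumed.
    static long timeVarying(int iterations) {
        double acc = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            acc += Math.sqrt(i + 1.0);
        }
        long elapsed = System.nanoTime() - start;
        sink = Double.doubleToLongBits(acc);
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 10_000_000;
        // A few rounds so that later ones run JIT-compiled code.
        for (int round = 0; round < 5; round++) {
            System.out.println("invariant: " + timeInvariant(n) + " ns, varying: " + timeVarying(n) + " ns");
        }
    }
}
```

If the "invariant" timings collapse towards zero in later rounds while the "varying" ones do not, that is the effect described above.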
Upvotes: 1
Reputation: 533500
System.nanoTime() can be a fast, nanosecond-resolution timer, depending on the OS. On some OSes a call takes as little as 20 ns.
In this library I use RDTSC because RHEL 5.x is not one of the OSes where it is fast. :( https://github.com/peter-lawrey/Java-Thread-Affinity It takes less than 10 ns on a fast PC.
The problem with using the CPU counter is that it's different on different sockets. If your program only runs on one socket, this is not a problem.
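If you want to see which case your own OS falls into, something like the following gives a rough idea of what a System.nanoTime() call costs and how small a step it can report. The class and method names (`NanoTimeCost`, `averageCallCostNanos`, `smallestObservedStep`) are just for this sketch, and the figures will vary with hardware, OS, and JVM.

```java
final class NanoTimeCost {
    // Mean cost of one System.nanoTime() call, estimated over `calls` back-to-back calls.
    static double averageCallCostNanos(int calls) {
        if (calls <= 0) {
            throw new IllegalArgumentException("calls must be positive");
        }
        long last = 0;
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            last = System.nanoTime();
        }
        return (double) (last - start) / calls;
    }

    // Smallest non-zero difference seen between consecutive readings.
    // Returns Long.MAX_VALUE if the clock never advanced during sampling.
    static long smallestObservedStep(int samples) {
        long min = Long.MAX_VALUE;
        long prev = System.nanoTime();
        for (int i = 0; i < samples; i++) {
            long now = System.nanoTime();
            long delta = now - prev;
            if (delta > 0 && delta < min) {
                min = delta;
            }
            prev = now;
        }
        return min;
    }

    public static void main(String[] args) {
        averageCallCostNanos(1_000_000); // warm-up pass, result ignored
        System.out.println("avg cost per call (ns): " + averageCallCostNanos(1_000_000));
        System.out.println("smallest observed step (ns): " + smallestObservedStep(1_000_000));
    }
}
```

This only measures the standard Java timer; it does not read the TSC directly, which needs native code such as the library linked above.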
Upvotes: 2