Jhonny007

Reputation: 1798

Difference in nanosecond precision between JVM and C++

When measuring time on the JVM with System.nanoTime() you get higher precision than with std::chrono::high_resolution_clock. How can that be, and is there a cross-platform way to get the same precision in C++ as on the JVM?

Examples:

Kotlin (JVM):

fun main(args: Array<String>) {
    for (i in 0..10)
        test() // warmup

    println("Average resolution: ${test()}ns")
}

fun test(): Double {
    val timeList = mutableListOf<Long>()

    for (i in 0 until 10_000_000) {
        val time = System.nanoTime()
        if (timeList.isEmpty() || time != timeList.last())
            timeList.add(time)
    }

    return timeList
            .mapIndexed { i, l -> if (i > 0) l - timeList[i - 1] else null }
            .filterNotNull()
            .average()
}

Output: Average resolution: 433.37ns

C++:

#include <cstdio>
#include <iostream>
#include <chrono>
#include <numeric>
#include <vector>

int main() {
    using namespace std;
    using namespace chrono;

    vector<long long int> time_list;

    for(int i = 0; i < 10'000'000; ++i) {
        auto time = duration_cast<nanoseconds>(high_resolution_clock::now().time_since_epoch()).count();
        if(time_list.empty() || time != time_list.back())
            time_list.push_back(time);
    }

    adjacent_difference(time_list.begin(), time_list.end(), time_list.begin());
    auto result = accumulate(time_list.begin() + 1, time_list.end(), 0.0) / (time_list.size() - 1);

    printf("Average resolution: %.2fns", result);

    return 0;
}

Output: Average resolution: 15625657.89ns (MinGW g++)

Edit: Output: Average resolution: 444.88ns (MSVC)

This was done on Windows, but on Linux I get similar results.

Edit:

Alright, the original C++ output was produced with MinGW g++; after switching to MSVC I got results on par with the JVM (444.88ns).

Upvotes: 0

Views: 275

Answers (2)

Erwin Bolwidt

Reputation: 31279

Your Java (Kotlin) example is not measuring nanosecond granularity; it is mostly measuring how long it takes to garbage-collect a list of Long objects (or to expand the heap, or to allocate the objects and object headers; if you only run the test once, the collector may attempt to run, but it won't succeed for as long as the loop is holding the list).

Java is pretty fast with memory allocation, usually faster than the standard memory allocator libraries for C/C++.

For C++, it's possible that a significant percentage of the perceived precision of the nanosecond clock actually comes from calling push_back 10 million times on a vector, which involves a number of reallocations.

A better test would be the following (Kotlin, but the same can be done for C++): there is no need to remember the timestamps in a list in order to calculate the average difference between them.

fun main(args: Array<String>) {
    for (i in 0 until 10) {
        runTest();
    }
}

fun runTest() {
    var lastTime = System.nanoTime()
    var count = 0;
    var total = 0L;
    for (i in 0 until 50_000_000) {
        val time = System.nanoTime()
        if (time > lastTime) {
            count++;
            total += time - lastTime;
            lastTime = time;
        }
    }

    val result = total / count;

    println("Average resolution: ${result}ns")
}

Note: this gives me a pretty consistent 32-35ns precision in Java, much better than the 45-200 ns that your original code gave me.

As for your C++ code, your original version running on my MacBook Pro gives me 68-78ns (when run in a loop that executes it 10 times).

I've also removed the unnecessary vector from your code, and then it gives a 50-51ns result, which is a decent indication that the real granularity is about 50ns.

The JVM does somewhat better than that with 32-35ns (38% better than 50ns), but the margin is nowhere near as big as what you mentioned.

Please try again and post the output with code that doesn't store the results in a list unnecessarily, as this greatly influences the results.

#include <cstdio>
#include <iostream>
#include <chrono>
#include <numeric>
#include <vector>


int main1() {
    using namespace std;
    using namespace chrono;

    long long total = 0;
    int count = 0;
    auto lastTime = duration_cast<nanoseconds>(high_resolution_clock::now().time_since_epoch()).count();
    for(int i = 0; i < 50000000; ++i) {
        auto time = duration_cast<nanoseconds>(high_resolution_clock::now().time_since_epoch()).count();
        if (time > lastTime) {
            count++;
            total += time - lastTime;
            lastTime = time; 
        }
    }

    long long result = total / count;

    printf("Average resolution: %lld ns\n", result);

    return 0;
}

int main() {
    for (int i = 0; i < 10; i++) {
        main1();
    }
}

Upvotes: 1

apetranzilla

Reputation: 5949

The resolution in C++ is platform-dependent, but you may be able to get better accuracy using platform-specific calls (e.g. clock_gettime from time.h on Linux).

Upvotes: 0
