James Oswald

Reputation: 603

Why does std::chrono say my function took zero nanoseconds to execute?

I am working on a project where I implement two popular MST algorithms in C++ and then print how long each one takes to execute. Please ignore the actual algorithms; I have already tested them and am only interested in getting accurate measurements of how long they take.

void Graph::krushkalMST(bool e){
    size_t s2 = size * size;
    typedef struct{uint loc; uint val;} wcType; //struct used for storing a copy of the weights values to be sorted, with original locations
    wcType* weightsCopy = new wcType[s2];       //copy of the weights which will be sorted.
    for(size_t i = 0; i < s2; i++){
        weightsCopy[i].loc = i;
        weightsCopy[i].val = weights[i];
    }
    std::vector<uint> T(0);                                     //List of edges in the MST
    auto start = std::chrono::high_resolution_clock::now();     //time the program was started
    typedef int (*cmpType)(const void*, const void*);           //comparison function type
    static cmpType cmp = [](const void* ua, const void* ub){    //Compare function used by the sort as a C++ lambda
        uint a = ((wcType*)ua)->val, b = ((wcType*)ub)->val;
        return (a == b) ? 0 : (a == NULLEDGE) ? 1 : (b == NULLEDGE) ? -1 : (a < b) ? -1 : 1;
    };
    std::qsort((void*)weightsCopy, s2, sizeof(wcType), cmp);    //sort edges into ascending order using a quick sort (supposedly quick sort)
    uint* componentRefs = new uint[size];                       //maps nodes to what component they currently belong to
    std::vector<std::vector<uint>> components(size);            //vector of components, each component is a vector of nodes;
    for(int i = 0; i < size; i++){
        //unOptimize(components);
        components[i] = std::vector<uint>({(uint)i});
        componentRefs[i] = i;
    }
    for(int wcIndex = 0; components.size() >= 2 ; wcIndex++){
        uint i = getI(weightsCopy[wcIndex].loc), j = getJ(weightsCopy[wcIndex].loc); //get pair of nodes with the smallest edge
        uint ci = componentRefs[i], cj = componentRefs[j];      //locations of nodes i and j
        if(ci != cj){
            T.push_back(weightsCopy[wcIndex].loc);              //push the edge into T
            for(int k = 0; k < components[cj].size(); k++)      //move each member in j's component to i's component
                components[ci].push_back(components[cj][k]);
            for(int k = 0; k < components[cj].size(); k++)      //copy this change into the reference locations
                componentRefs[components[cj][k]] = ci;
            components.erase(components.begin() + cj);          //delete j's component
            for(int k = 0; k < size; k++)                                   
                if(componentRefs[k] >= cj)
                    componentRefs[k]--;
        }
    }
    auto end = std::chrono::high_resolution_clock::now();
    uint time = std::chrono::duration_cast<std::chrono::nanoseconds>(end-start).count();
    std::cout<<"\nMST found my krushkal's Algorithm:\n";
    printData(time, T, e);
    delete[] weightsCopy;
    delete[] componentRefs;
}

void Graph::primMST(bool e){
    std::vector<uint> T(0);                                     //List of edges in the MST
    auto start = std::chrono::high_resolution_clock::now();     //Start calculating the time the algorithm takes
    bool* visited = new bool[size];                             //Maps each node to a visited value
    visited[0] = true;
    for(int i = 1; i < size; i++)
        visited[i] = false;
    for(uint numVisited = 1; numVisited < size; numVisited++){
        uint index = 0;                                         //index of the smallest cost edge to unvisited node
        uint minCost = std::numeric_limits<uint>::max();                //cost of the smallest edge filling those conditions
        for(int i = 0; i < size; i++){
            if(visited[i]){
                for(int j = 0; j < size; j++){
                    if(!visited[j]){
                        uint curIndex = i * size + j, weight = dweights[curIndex];
                        if(weight != NULLEDGE && weight < minCost){
                            index = curIndex;
                            minCost = weight;
                        }
                    }
                }
            }
        }
        T.push_back(index);
        visited[getI(index)] = true;
    }
    auto end = std::chrono::high_resolution_clock::now();
    uint time = std::chrono::duration_cast<std::chrono::microseconds>(end-start).count();
    std::cout<<"\nMST found my Prim's Algorithm:\n";
    printData(time, T, e);
    delete[] visited;
}

I initially used clock() from <ctime> to try to get an accurate measurement of how long this would take. My largest test file has a graph of 40 nodes with 780 edges (large enough to warrant some compute time), and even then, on a slow computer using g++ with -O0, I would get either 0 or 1 milliseconds. On my desktop I was only ever able to get 0 ms. Since I need a more accurate way to distinguish time between test cases, I decided to try the high_resolution_clock provided by the <chrono> library.
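For context, here is a minimal sketch of that clock()-based approach (the loop body is just a stand-in for the real algorithm): clock() advances in ticks of 1/CLOCKS_PER_SEC seconds, and CLOCKS_PER_SEC is 1000 on Windows, so millisecond granularity is the best it can report there.

#include<cstdio>
#include<ctime>

int main(){
    std::clock_t start = std::clock();
    volatile double c = 1;                  //volatile so the loop is not optimized away
    for(int i = 0; i < 1000000; i++)
        c = c + 1.0 / (i + 1);
    std::clock_t end = std::clock();
    //clock() ticks 1/CLOCKS_PER_SEC seconds apart; on Windows CLOCKS_PER_SEC == 1000,
    //so anything faster than ~1 ms rounds down to 0
    double ms = 1000.0 * (end - start) / CLOCKS_PER_SEC;
    std::printf("elapsed: %.0f ms\n", ms);
}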

This is where the real trouble began: I would (and still do) consistently get that the program took 0 nanoseconds to execute. [Screenshot: output of my program for the hardest test case.] In my search for a solution I came across multiple questions that deal with similar issues, most of which state that <chrono> is system dependent and you're unlikely to actually be able to get nanosecond or even microsecond values. Nevertheless, I tried using std::chrono::microseconds, only to still consistently get 0. Eventually I found what I thought was someone having the same problem as me:

counting duration with std::chrono gives 0 nanosecond when it should take long

However, that is clearly a problem of an overactive optimizer deleting an unnecessary piece of code, whereas in my case the end result always depends on the results of a series of complex loops which must be executed in full. I am on Windows 10, compiling with GCC using -O0.

My best hypothesis is that I'm doing something wrong, or that Windows doesn't support anything smaller than milliseconds while using std::chrono, and that std::chrono::nanoseconds values are actually just milliseconds padded with 0s on the end (which is what I observe when I put a system("pause") in the algorithm and unpause at arbitrary times). Please let me know if you find any way around this or if there is any other way I can achieve higher-resolution timing.
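One way to test that hypothesis is to ask the clock itself. Every <chrono> clock exposes its nominal tick length as the compile-time ratio Clock::period and reports whether it is monotonic via Clock::is_steady; a minimal sketch that prints both:

#include<chrono>
#include<iostream>

int main(){
    using hrc = std::chrono::high_resolution_clock;
    //period is a std::ratio: tick length in seconds (e.g. 1/1000000000 for nanoseconds)
    std::cout << "nominal tick: " << hrc::period::num << "/" << hrc::period::den << " s\n";
    std::cout << "is_steady:    " << std::boolalpha << hrc::is_steady << "\n";
}

Note that period only says what the duration type can represent; the step the OS actually delivers between two now() calls can be far coarser.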


At the request of @Ulrich Eckhardt, I am including a minimal reproducible example as well as the results of the test I performed using it, and I must say it is rather insightful.

#include<iostream>
#include<chrono>
#include<cmath>
#include<cstdlib>

int main()
{
    double c = 1; 

    for(int itter = 1; itter < 10000000; itter *= 10){
        auto start = std::chrono::high_resolution_clock::now();
        for(int i = 0; i < itter; i++)
            c += sqrt(c) + log(c);
        auto end = std::chrono::high_resolution_clock::now();
        auto time = std::chrono::duration_cast<std::chrono::nanoseconds>(end-start).count();
        std::cout<<"calculated: "<<c<<". "<<itter<<" iterations took "<<time<<"ns\n";
    }
    system("pause");
}

For my loop I chose an arbitrary mathematical formula and made sure to use the result of what the loop does so it's not optimized out of existence. Testing it with various iteration counts on my desktop yields: [screenshot of the timing results]
This seems to imply that a certain threshold is required before the clock starts counting time at all: if the loop really takes O(n) time for n iterations, dividing the first non-zero timing by 10 should still give a non-zero timing for the previous iteration count, yet the measured result is 0. If anything, this small example baffles me even further.
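That suspected threshold can actually be measured: busy-wait on now() until the returned time point changes, and the difference is the smallest step the clock really reports. A minimal sketch (the value printed is platform- and build-dependent):

#include<chrono>
#include<iostream>

int main(){
    using clk = std::chrono::high_resolution_clock;
    auto t0 = clk::now();
    auto t1 = t0;
    while(t1 == t0)     //spin until the clock visibly advances
        t1 = clk::now();
    auto tick = std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0);
    std::cout << "smallest observable step: " << tick.count() << " ns\n";
}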

Upvotes: 4

Views: 2913

Answers (1)

David Brown

Reputation: 532

Switch to steady_clock and you get the correct results for both MSVC and MinGW GCC.

You should avoid using the high_resolution_clock as it is just an alias to either steady_clock or system_clock. For measuring elapsed time in a stopwatch-like fashion, you always want steady_clock. high_resolution_clock is an unfortunate thing and should be avoided.

I just checked and MSVC has the following:

using high_resolution_clock = steady_clock;

while MinGW GCC has:

/**
 *  @brief Highest-resolution clock
 *
 *  This is the clock "with the shortest tick period." Alias to
 *  std::system_clock until higher-than-nanosecond definitions
 *  become feasible.
*/
using high_resolution_clock = system_clock;
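Applied to the minimal example from the question, only the clock needs to change. A minimal sketch of the same measurement pattern with steady_clock (one fixed iteration count instead of the loop over counts):

#include<chrono>
#include<cmath>
#include<iostream>

int main(){
    double c = 1;
    auto start = std::chrono::steady_clock::now();  //monotonic, suitable for intervals
    for(int i = 0; i < 1000000; i++)
        c += std::sqrt(c) + std::log(c);
    auto end = std::chrono::steady_clock::now();
    auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(end - start);
    std::cout << "calculated: " << c << ", 1000000 iterations took " << ns.count() << " ns\n";
}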

Upvotes: 4
