Reputation: 603
I am working on a project where I implement two popular MST algorithms in C++ and then print how long each one takes to execute. Please ignore the actual algorithms; I have already tested them and am only interested in getting accurate measurements of how long they take.
void Graph::krushkalMST(bool e){
    size_t s2 = size * size;
    typedef struct{uint loc; uint val;} wcType; //struct used for storing a copy of the weight values to be sorted, along with their original locations
    wcType* weightsCopy = new wcType[s2];       //copy of the weights which will be sorted
    for(int i = 0; i < s2; i++){
        weightsCopy[i].loc = i;
        weightsCopy[i].val = weights[i];
    }
    std::vector<uint> T(0);                     //list of edges in the MST
    auto start = std::chrono::high_resolution_clock::now(); //time the algorithm was started
    typedef int (*cmpType)(const void*, const void*);       //comparison function type
    static cmpType cmp = [](const void* ua, const void* ub){ //compare function used by the sort, written as a C++ lambda
        uint a = ((wcType*)ua)->val, b = ((wcType*)ub)->val;
        return (a == b) ? 0 : (a == NULLEDGE) ? 1 : (b == NULLEDGE) ? -1 : (a < b) ? -1 : 1;
    };
    std::qsort((void*)weightsCopy, s2, sizeof(wcType), cmp); //sort edges into ascending order (supposedly a quick sort)
    uint* componentRefs = new uint[size];                    //maps nodes to the component they currently belong to
    std::vector<std::vector<uint>> components(size);         //vector of components, each component is a vector of nodes
    for(int i = 0; i < size; i++){
        //unOptimize(components);
        components[i] = std::vector<uint>({(uint)i});
        componentRefs[i] = i;
    }
    for(int wcIndex = 0; components.size() >= 2; wcIndex++){
        uint i = getI(weightsCopy[wcIndex].loc), j = getJ(weightsCopy[wcIndex].loc); //get the pair of nodes joined by the smallest remaining edge
        uint ci = componentRefs[i], cj = componentRefs[j];   //components of nodes i and j
        if(ci != cj){
            T.push_back(weightsCopy[wcIndex].loc);           //push the edge into T
            for(int k = 0; k < components[cj].size(); k++)   //move each member of j's component into i's component
                components[ci].push_back(components[cj][k]);
            for(int k = 0; k < components[cj].size(); k++)   //copy this change into the reference locations
                componentRefs[components[cj][k]] = ci;
            components.erase(components.begin() + cj);       //delete j's component
            for(int k = 0; k < size; k++)
                if(componentRefs[k] >= cj)
                    componentRefs[k]--;
        }
    }
    auto end = std::chrono::high_resolution_clock::now();
    uint time = std::chrono::duration_cast<std::chrono::nanoseconds>(end-start).count();
    std::cout<<"\nMST found by Kruskal's Algorithm:\n";
    printData(time, T, e);
    delete[] weightsCopy;
    delete[] componentRefs;
}
void Graph::primMST(bool e){
    std::vector<uint> T(0);                     //list of edges in the MST
    auto start = std::chrono::high_resolution_clock::now(); //start timing the algorithm
    bool* visited = new bool[size];             //maps each node to a visited flag
    visited[0] = true;
    for(int i = 1; i < size; i++)
        visited[i] = false;
    for(uint numVisited = 1; numVisited < size; numVisited++){
        uint index = 0;                                  //index of the smallest-cost edge to an unvisited node
        uint minCost = std::numeric_limits<uint>::max(); //cost of the smallest edge satisfying those conditions
        for(int i = 0; i < size; i++){
            if(visited[i]){
                for(int j = 0; j < size; j++){
                    if(!visited[j]){
                        uint curIndex = i * size + j, weight = dweights[curIndex];
                        if(weight != NULLEDGE && weight < minCost){
                            index = curIndex;
                            minCost = weight;
                        }
                    }
                }
            }
        }
        T.push_back(index);
        visited[getI(index)] = true;
    }
    auto end = std::chrono::high_resolution_clock::now();
    uint time = std::chrono::duration_cast<std::chrono::microseconds>(end-start).count();
    std::cout<<"\nMST found by Prim's Algorithm:\n";
    printData(time, T, e);
    delete[] visited;
}
I initially used clock() from <ctime> to try to get an accurate measurement of how long this would take. My largest test file has a graph of 40 nodes with 780 edges (large enough to warrant some compute time), and even then, on a slow computer compiling with g++ at -O0, I would get either 0 or 1 milliseconds. On my desktop I was only ever able to get 0 ms. Since I need a more accurate way to distinguish time between test cases, I decided to try the high_resolution_clock provided by the <chrono> library.
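Roughly, that clock()-based timing looked like the following (a reconstruction of the approach described above rather than my exact code; the function name is just a placeholder):

#include <ctime>
#include <iostream>

void timeMST(){                                          //placeholder wrapper, not the real project function
    std::clock_t start = std::clock();                   //processor tick count at the start
    // ... run the MST algorithm here ...
    std::clock_t end = std::clock();                     //tick count at the end
    double ms = 1000.0 * (end - start) / CLOCKS_PER_SEC; //convert ticks to milliseconds
    std::cout << "took " << ms << " ms\n";
}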
This is where the real trouble began: I would (and still do) consistently get that the program took 0 nanoseconds to execute.
In my search for a solution I came across multiple questions dealing with similar issues, most of which state that <chrono> is system dependent and you're unlikely to actually get nanosecond or even microsecond values. Nevertheless, I tried using std::chrono::microseconds, only to still consistently get 0. Eventually I found what I thought was someone having the same problem as me:
counting duration with std::chrono gives 0 nanosecond when it should take long
However, that is clearly a problem of an overactive optimizer deleting an unneeded piece of code, whereas in my case the end result always depends on the results of a series of complex loops which must be executed in full. I am on Windows 10, compiling with GCC using -O0.
My best hypothesis is that I'm doing something wrong, or that Windows doesn't support anything smaller than milliseconds when using std::chrono, and that std::chrono::nanoseconds values are really just milliseconds padded with zeros on the end (which is what I observe when I put a system("pause") in the algorithm and unpause at arbitrary times). Please let me know if you can find any way around this, or if there is any other way I can achieve higher-resolution timing.
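For what it's worth, the tick period each standard clock advertises can be printed like this (a small diagnostic sketch, separate from my project; the advertised period is nominal, so the actual achievable resolution may be coarser):

#include <chrono>
#include <iostream>

int main(){
    using namespace std::chrono;
    // period::num / period::den is the advertised tick length in seconds;
    // is_steady is true if the clock is never adjusted (e.g. by system time changes).
    std::cout << "high_resolution_clock: " << high_resolution_clock::period::num
              << "/" << high_resolution_clock::period::den
              << " s, steady = " << high_resolution_clock::is_steady << "\n";
    std::cout << "steady_clock:          " << steady_clock::period::num
              << "/" << steady_clock::period::den
              << " s, steady = " << steady_clock::is_steady << "\n";
    std::cout << "system_clock:          " << system_clock::period::num
              << "/" << system_clock::period::den
              << " s, steady = " << system_clock::is_steady << "\n";
}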
At the request of @Ulrich Eckhardt, I am including a minimal reproducible example, as well as the results of the test I performed using it, and I must say it is rather insightful.
#include<iostream>
#include<chrono>
#include<cmath>
#include<cstdlib> //for system()

int main()
{
    double c = 1;
    for(int itter = 1; itter < 10000000; itter *= 10){
        auto start = std::chrono::high_resolution_clock::now();
        for(int i = 0; i < itter; i++)
            c += sqrt(c) + log(c);
        auto end = std::chrono::high_resolution_clock::now();
        int time = std::chrono::duration_cast<std::chrono::nanoseconds>(end-start).count();
        std::cout<<"calculated: "<<c<<". "<<itter<<" iterations took "<<time<<"ns\n";
    }
    system("pause");
}
For my loop I chose an arbitrary mathematical formula and made sure to use the result of what the loop computes, so it isn't optimized out of existence. Testing it with various iteration counts on my desktop yields:
This seems to imply that a certain threshold is required before it starts counting time: dividing the time taken at the first iteration count that yields a non-zero result by 10 still gives a non-zero value, which (assuming the whole loop is O(n) in the number of iterations) is what the previous, ten-times-smaller run should have reported, yet it reported 0. If anything, this small example baffles me even further.
Upvotes: 4
Views: 2913
Reputation: 532
Switch to steady_clock and you get the correct results for both MSVC and MinGW GCC.

You should avoid using high_resolution_clock, as it is just an alias for either steady_clock or system_clock. For measuring elapsed time in a stopwatch-like fashion, you always want steady_clock. high_resolution_clock is an unfortunate thing and should be avoided.
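For example, the measurement in the question only needs the clock swapped; everything else stays the same (sketch):

auto start = std::chrono::steady_clock::now();
// ... code being measured ...
auto end = std::chrono::steady_clock::now();
auto ns = std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();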
I just checked and MSVC has the following:
using high_resolution_clock = steady_clock;
while MinGW GCC has:
/**
* @brief Highest-resolution clock
*
* This is the clock "with the shortest tick period." Alias to
* std::system_clock until higher-than-nanosecond definitions
* become feasible.
*/
using high_resolution_clock = system_clock;
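You can also check which alias a given standard library uses directly in code (sketch; std::is_same_v requires C++17, with C++11 use std::is_same<...>::value instead):

#include <chrono>
#include <iostream>
#include <type_traits>

int main(){
    // Prints 1 or 0 depending on which clock high_resolution_clock aliases.
    std::cout << "is steady_clock: "
              << std::is_same_v<std::chrono::high_resolution_clock,
                                std::chrono::steady_clock> << "\n";
    std::cout << "is system_clock: "
              << std::is_same_v<std::chrono::high_resolution_clock,
                                std::chrono::system_clock> << "\n";
}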
Upvotes: 4