AwaitedOne

Reputation: 1012

Profiling a C++ program using high_resolution_clock

I want to profile my C++ program and am using high_resolution_clock for that purpose. Sample code is provided below. I tried three different ways.

Example 1

#include<iostream>
#include <chrono>

using namespace std;
using namespace chrono;

unsigned int ttime = 0;
int main(){
    int i = 0;
    int j = 0;
    while(i < 10000000){
        high_resolution_clock::time_point t1 = high_resolution_clock::now();
        j += i;     
        i++;
        high_resolution_clock::time_point t2 = high_resolution_clock::now();
        auto tm_duration = duration_cast<microseconds>(t2 - t1).count();
        ttime += tm_duration;
    }
    cout << "Took " << ttime << " microseconds " << endl;
    return 0;
}

This example uses the clock inside the loop and works well; it gives the results I expect.

Example 2

#include<iostream>
#include <chrono>

using namespace std;
using namespace chrono;

unsigned int ttime = 0;
int main(){
    int i = 0;
    int j = 0;
    high_resolution_clock::time_point t1 = high_resolution_clock::now();
    while(i < 10000000){
        j += i;     
        i++;
    }
    high_resolution_clock::time_point t2 = high_resolution_clock::now();
    auto tm_duration = duration_cast<microseconds>(t2 - t1).count();
    ttime += tm_duration;
    cout << "Took " << ttime << " microseconds " << endl;
    return 0;
}

This example reports 0 microseconds, which I doubt is correct.

Example 3

#include<iostream>
#include <chrono>

using namespace std;
using namespace chrono;

unsigned int ttime = 0;
int main(){
    int i = 0;
    int j = 0;
    high_resolution_clock::time_point t1 = high_resolution_clock::now();
    while(i < 10000000){
        j += i;     
        i++;
        high_resolution_clock::time_point t2 = high_resolution_clock::now();
        auto tm_duration = duration_cast<microseconds>(t2 - t1).count();
        ttime += tm_duration;
    }   
    cout << "Took " << ttime << " microseconds " << endl;
    return 0;
}

This example reports 3792420263 microseconds, which I also doubt.

What is the problem in Example 2 and Example 3? Which of the three is correct?

Upvotes: 2

Views: 4665

Answers (2)

vasek

Reputation: 2839

Example 2 gets optimized away by the compiler: j is never used after the loop, so the optimizer is free to remove the loop entirely. If you want to see the loop's run time, you should disable compiler optimizations.

Example 3 makes no sense: t1 is taken once before the loop, so every iteration adds the total time elapsed since the start, and the resulting sum says nothing about the loop's run time. Furthermore, the 32-bit unsigned int ttime overflows many times during the loop.

So the best approach for profiling real code is Example 2. Do not worry about the zero output: if you put any "reasonable code" between the creation of t1 and t2, you will get a more meaningful number.

Upvotes: 2

Freakyy

Reputation: 365

The compiler optimizes those loops away. If you disable optimizations in the Options menu, it will show plausible values.

Upvotes: 0
