Reputation:

running the program in the Release mode

I want to measure the execution time of some code inside a big loop, for example 1000000000 iterations. When I run the program in Debug mode, the measured duration is 3.192 sec, but in Release mode it is 0.000. I tried a bigger counter, but it was still 0.000 sec. In Release mode the result is the same for every larger counter I tried, such as 10000000000.
What is the problem?

My code is:

#include <time.h>
#include <stdio.h>
#include <iostream>

int main() {
    clock_t t1, t2;

    double d1 = 97.9834;
    double d2 = 897.9134;
    double d3 = 0.0;

    t1 = clock();
    for (size_t i = 0; i < 1000000000; i++)
    {
        d3 = d1 + d2;
    }
    t2 = clock();

    char msg[100];
    // clock() counts in units of 1/CLOCKS_PER_SEC seconds (CLOCKS_PER_SEC is 1000 on MSVC)
    sprintf_s(msg, 100, "time is :%.3f\n", double(t2 - t1) / CLOCKS_PER_SEC);
    std::cout << msg << std::endl;

    return 0;
}

Upvotes: 1

Answers (5)

JSF

Reputation: 5321

You need to somehow hide the uselessness of the operation from the optimizer. But since you want to measure how long the operation takes optimized, you would prefer not to destroy the optimization of the operation itself in the course of convincing the optimizer not to remove it entirely.

There isn't much you can do with d3 = d1 + d2; if you make d3 and one of d1 or d2 volatile, you stop the optimizer from eliminating any of the work you want to measure, but at the cost of adding some work you may not want to measure.

In other cases, there are better ways to keep the optimizer from removing what you want to measure. In larger cases, a common method is to put a key part into a different compilation unit so the optimizer can't see what that part doesn't do. But in this case, the call overhead of doing that would be even worse than the extra cost of volatile.

If the question was a simplified "why doesn't measuring time work in release" and you really wanted to measure something more complicated than d3 = d1 + d2; you might get better help by asking about what you really want to measure. But if the goal really is d3 = d1 + d2; there is no answer. The time that takes is entirely a function of how it is used, including that it takes zero time when the optimizer sees it is useless.

Upvotes: 1

user2249683

Reputation:

You have to use the results of a performance test to prevent unwanted optimizations. One way to do that is to make a variable volatile:

#include<time.h>
#include<iostream>

volatile double d = 0;
int main() {
    clock_t t1 = clock();
    for (size_t i = 0; i < 1000000000; i++)
    {
        d = 0;
    }
    clock_t t2 = clock();
    std::cout << "time is " << t2 - t1 << std::endl;
    // Debug:   3474486
    // Release:  484208

    return 0;
}

Upvotes: 1

hyde

Reputation: 62906

The compiler is required to produce optimized code that behaves "as-if" no optimizations had been done (with a few exceptions such as fast-math floating point optimizations), except for execution time. You could turn off optimizations, but then you wouldn't be measuring optimized code.

The root of your problem is that you just can't measure what you are trying to measure, or alternatively you are already getting the correct result (the loop is optimized away, as it should be). However, if you want to force the compiler to generate code, there is a keyword for that: volatile

volatile double d1 = 97.9834;
volatile double d2 = 897.9134;
volatile double d3 = 0.0;

Now the compiler is required to put the variables in memory, and to really read the values of d1 and d2 and write the result to d3 every time the addition statement in the loop executes, as many times as the loop iterates. You could also make only one or two of these volatile, so the compiler could skip some of the code or keep the non-volatile variable in a register, and the loop would be faster.

Whether you will be measuring something useful after making some of these variables volatile depends on what you actually want to measure...

More info: The purpose of volatile in C and C++ is to tell the compiler that this variable is something like a memory-mapped hardware register, and that reading or writing it has some external effect, so it must never be optimized away. It is rarely useful in PC programs and, despite a common misconception, is not related to multi-threaded programming (unlike, for example, in Java).

Upvotes: 2

EmDroid

Reputation: 6046

Besides the for loop being optimized out completely, the clock() function isn't the best tool in the world to measure time. For starters, you definitely want to start the measurement on a clock() tick change (the clock() resolution is relatively coarse compared to high-precision counters). Refer to CLOCKS_PER_SEC, which for example is only 1000 in MSVC. When a measurement starts in the middle of a clock() tick, the results will be very inconsistent.

A common pattern to increase the consistency of the results is:

clock_t start;
const clock_t tmp = clock();
do {
    start = clock();
} while (tmp == start);

// execute the code

const clock_t end = clock();

Upvotes: 1

ForceBru

Reputation: 44926

I believe that this for loop is optimized away in production code since it doesn't do anything useful: you never use the value of d3 so why even compute it?

If you want this loop to be executed, turn off the optimization (pass -O0 to the compiler).

Upvotes: 1
