Reputation: 1052
How to avoid the compiler optimizing some operation away?
For example, I implemented my own sprintf2 and I want to compare the performance of my sprintf2 with stdlib's sprintf, so I wrote this code:
#include <iostream>
#include <cstdio>
#include <ctime>
using namespace std;
int main()
{
    char c[50];
    double d = -2.532343e+23;
    int MAXN = 1e8;
    clock_t t1, t2, t3;
    t1 = clock();
    for (int i = 0; i < MAXN; i++)
        sprintf2(c, "%16.2e", d); // my own implementation of sprintf
    t2 = clock();
    for (int i = 0; i < MAXN; i++)
        sprintf(c, "%16.2e", d);
    t3 = clock();
    printf("sprintf2:%dms\nsprintf:%dms\n", (int)(t2 - t1), (int)(t3 - t2));
    return 0;
}
It turns out:
sprintf2:523538ms //something big, I forgot the exact value
sprintf:0ms
As we know, sprintf costs time, and MAXN is so big that t3-t2 shouldn't be 0. Since array c is never used and d is the same on every iteration, I guess the compiler optimized the loop away so that sprintf was only executed once.
So here is the question: how can I measure the real time that 1e8 calls to sprintf cost?
Upvotes: 1
Views: 1339
Reputation: 1
The compiler optimized the calls to sprintf away because you did not use the result, and because it always prints the same number. So also change the printed number (if you call sprintf with the same arguments inside a loop, the compiler is allowed to hoist the call out of the loop). And just use the result, e.g. by computing a (meaningless) sum of some of the characters:
int s=0;
memset(c, 0, sizeof(c)); // memset needs <cstring>
t1=clock();
for(int i=0;i<MAXN;i++) {
    sprintf2(c,"%16.2e",d+i*1.0e-9); // vary the printed number
    s+=c[i%8];                       // use the result
}
t2=clock();
for(int i=0;i<MAXN;i++) {
    sprintf(c,"%16.2e",d+i*1.0e-9);
    s+=c[i%8];
}
t3=clock();
printf("sprintf2:%dms\nsprintf:%dms\ns=%d\n",(int)(t2-t1),(int)(t3-t2),s);
Then you should be able to compile and benchmark. You probably want to display the time cost of each call:
printf("sprintf2:%f ms\nsprintf:%f ms\n",
1.0e3*(t2-t1)/(double)maxn, 1.0e3*(t3-t2)/(double)maxn);
since POSIX requires CLOCKS_PER_SEC to be 1000000, a clock tick is one microsecond (hence the 1.0e-3 factor above to get milliseconds).
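If you want to stay portable beyond POSIX, here is a minimal variant (same t2, t3, maxn as above) that divides by CLOCKS_PER_SEC instead of hard-coding the one-microsecond tick:
/* portable per-call cost: the C standard does not require
   CLOCKS_PER_SEC to be 1000000, so divide by it explicitly */
double secs = (double)(t3-t2)/CLOCKS_PER_SEC;
printf("sprintf: %.3f ms per call\n", 1.0e3*secs/maxn);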
BTW, MAXN (which should be spelled in lower case; all-uppercase names are conventionally reserved for macros!) could be some input (otherwise a clever optimizing compiler could unroll the loop at compile time), e.g.
int main(int argc, char**argv) {
int maxn = argc>1 ? atoi(argv[1]) : 1000000;
Notice that when you are benchmarking, you really should ask the compiler to optimize with -O2
. Measuring the speed of unoptimized code is meaningless.
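For example (hypothetical commands, following the compile line used below), building the same benchmark with and without optimisation makes the difference easy to see:
% gcc -std=c99 -Wall -O0 b.c -o b0
% gcc -std=c99 -Wall -O2 b.c -o b2
% time ./b0 4000000
% time ./b2 4000000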
And you can always look at the generated assembler code (e.g. with gcc -O2 -fverbose-asm -S) and check that sprintf2 and sprintf are indeed called in a loop.
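For instance, a quick (hypothetical) check on the b.c program below; the exact symbol may differ, since glibc may rewrite the call to a __*_chk variant:
% gcc -O2 -fverbose-asm -S b.c
% grep -c 'call.*printf' b.s
A nonzero count means the call really survived into the generated assembly.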
BTW on my Linux Debian/Sid/x86-64 i7 3770K desktop:
/// file b.c
#include <stdio.h>
#include <time.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char**argv) {
int s=0;
char buf[50];
memset(buf, 0, sizeof(buf));
int maxn = (argc>1) ? atoi(argv[1]) : 1000000;
clock_t t1 = clock();
for (int i=0; i<maxn; i++) {
snprintf(buf, sizeof(buf), "%12.3f",
123.45678+(i*0.01)*(i%117));
s += buf[i%8];
}
clock_t t2 = clock();
printf ("maxn=%d s=%d deltat=%.3f sec, each iter=%.3f µsec\n",
maxn, s, (t2-t1)*1.0e-6, ((double)(t2-t1))/maxn);
return 0;
}
compiled as gcc -std=c99 -Wall -O3 b.c -o b (GCC 4.9.2, glibc 2.19), it gives the following consistent timings:
% time ./b 4000000
maxn=4000000 s=191871388 deltat=2.180 sec, each iter=0.545 µsec
./b 4000000 2.18s user 0.00s system 99% cpu 2.184 total
% time ./b 7000000
maxn=7000000 s=339696631 deltat=3.712 sec, each iter=0.530 µsec
./b 7000000 3.71s user 0.00s system 99% cpu 3.718 total
% time ./b 6000000
maxn=6000000 s=290285020 deltat=3.198 sec, each iter=0.533 µsec
./b 6000000 3.20s user 0.00s system 99% cpu 3.203 total
% time ./b 6000000
maxn=6000000 s=290285020 deltat=3.202 sec, each iter=0.534 µsec
./b 6000000 3.20s user 0.00s system 99% cpu 3.207 total
BTW, see this regarding the Windows clock implementation (which might be perceived as buggy). You might be as happy as I am installing and using Linux on your machine (I have never used Windows, but I have been using Unix or POSIX-like systems since 1987).
Upvotes: 2
Reputation: 6555
At least in GCC, the documentation states that the optimisation is not even turned on by default:
Most optimizations are only enabled if an -O level is set on the command line. Otherwise they are disabled, even if individual optimization flags are specified.
as you can read at https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html. But, given your results, I can't share that impression.
So if the expected behaviour does not appear even when no -O parameter is given (in MSVC you can set the optimisation level in the project properties; I remember there was a "no optimisation" flag), I would say there is no way to turn off the optimisations in the way you want.
But remember that the compiler does a lot of optimisation work that you cannot even control directly in the code. So there is not really a reason to "turn off everything", if that is what you are interested in; and going by the documentation, the latter does not seem to be possible anyway.
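A minimal sketch of a complementary technique (not from this answer, but common in benchmarking code): writing each result through a volatile object forces the compiler to keep every call, whatever the -O level, because stores to volatile objects count as observable behaviour:
#include <stdio.h>

volatile char sink; /* stores to a volatile object cannot be optimized away */

int main(void) {
    char c[50];
    double d = -2.532343e+23;
    for (int i = 0; i < 1000000; i++) {
        sprintf(c, "%16.2e", d + i*1.0e-9); /* vary the input as well */
        sink = c[0];                        /* "use" the result */
    }
    return 0;
}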
Upvotes: 0