Reputation: 12865
I am using the g++ 4.7.3 compiler. I'm trying to follow the description of the optimisation flags at http://gcc.gnu.org/onlinedocs/gcc-4.7.3/gcc/Optimize-Options.html and have run into the following problem:
I have a program whose run time differs between the -O2 and -O3 flags. -O2 is twice as fast as -O3: the time is 8 ms with -O2 and 16 ms with -O3.
So I would like to understand what exactly makes the difference. In the link above I see:
"O3 Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-vectorize and -fipa-cp-clone options."
So I simply took -O2 and added all the flags listed there:
-O2 -finline-functions -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-vectorize -fipa-cp-clone
And the time is 30 ms. But this set of options should be equivalent to -O3. Why is the time different? What am I doing wrong?
P.S. All results are reproducible to within 1 ms.
I have checked the options using
g++ -c -Q -Ox --help=optimizers
and saw that -O3 enables one additional option: -ftree-loop-distribute-patterns. But when I add it to the option set:
-O2 -finline-functions -funswitch-loops -fpredictive-commoning -fgcse-after-reload -ftree-vectorize -fipa-cp-clone -ftree-loop-distribute-patterns
the time is still 30 ms.
Upvotes: 6
Views: 904
Reputation: 7488
You can get g++ to show you which options are active with the -Q option:
g++ -c -Q -O3 --help=optimizers
The output is something like:
-O<number>
-Ofast
-Os
-falign-functions [enabled]
-falign-jumps [enabled]
-falign-labels [enabled]
-falign-loops [enabled]
-fasynchronous-unwind-tables [enabled]
-fbranch-count-reg [enabled]
-fbranch-probabilities [disabled]
-fbranch-target-load-optimize [disabled]
-fbranch-target-load-optimize2 [disabled]
-fbtr-bb-exclusive [disabled]
-fcaller-saves [enabled]
-fcombine-stack-adjustments [enabled]
-fcommon [enabled]
-fcompare-elim [enabled]
-fconserve-stack [disabled]
-fcprop-registers [enabled]
-fcrossjumping [enabled]
-fcse-follow-jumps [enabled]
-fcx-fortran-rules [disabled]
-fcx-limited-range [disabled]
-fdata-sections [disabled]
-fdce [enabled]
etc.
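To compare two flag sets, such listings can be saved to files and compared line by line. The following is only a minimal sketch: flag_diff is a hypothetical helper name, not part of GCC, and O3.txt and O2plus.txt are assumed file names. It prints every -f option whose state differs between the two listings. Bear in mind that identical listings would not necessarily mean identical code, since the GCC documentation notes that not all optimizations are controlled directly by a flag.

```shell
# flag_diff FILE1 FILE2  (hypothetical helper, not a GCC tool)
# Prints each -f option whose state differs between two saved
# "g++ -c -Q ... --help=optimizers" listings.
flag_diff() {
    awk '
        # Only consider lines of the form "-fname  [state]" (or "-fname=  value").
        $1 ~ /^-f/ && NF >= 2 {
            # While reading the first file, remember the state of each option.
            if (FNR == NR) { first[$1] = $2; next }
            # While reading the second file, report options whose state changed.
            if (($1 in first) && first[$1] != $2)
                print $1, first[$1], "->", $2
        }
    ' "$1" "$2"
}

# Assumed usage (requires g++; file names are arbitrary):
#   g++ -c -Q -O3 --help=optimizers > O3.txt
#   g++ -c -Q -O2 -finline-functions -funswitch-loops -fpredictive-commoning \
#       -fgcse-after-reload -ftree-vectorize -fipa-cp-clone \
#       -ftree-loop-distribute-patterns --help=optimizers > O2plus.txt
#   flag_diff O3.txt O2plus.txt
```

An empty result means that every option appearing in both listings has the same reported state; options present in only one listing are not reported by this sketch.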
Upvotes: 7