Reputation: 649
The GCC manual lists all optimization flags being applied for the different levels of optimizations (-O1, -O2, etc.). However when compiling and measuring a benchmark program (e.g. cBench's automotive_bitcount) there is a significant difference when applying an optimization level instead of turning on all the listed optimizations manually. For -O1 with the automotive_bitcount program, I measured a speedup of roughly 100% when compiling with -O1 instead of manually applying all the listed flags. Those "hidden" optimizations seem in fact to be the main part of the optimization work GCC does for -O1. When applying the flags manually, I only get a speedup of about 10% compared to no optimizations.
The same can be observed when applying all enabled flags from gcc -c -Q -O3 --help=optimizers
.
In the GCC manual I found this section which would explain this behavior:
Not all optimizations are controlled directly by a flag. Only optimizations that have a flag are listed in this section. Most optimizations are completely disabled at -O0 or if an -O level is not set on the command line, even if individual optimization flags are specified.
Since I couldn't find any further documentation on those optimizations, I wonder if there is a way of controlling them and what the optimizations are in detail?
Upvotes: 1
Views: 170
Reputation: 21878
Some optimizations are directly gated by -O
flags e.g. complete unroller:
{
public:
pass_complete_unrolli (gcc::context *ctxt)
: gimple_opt_pass (pass_data_complete_unrolli, ctxt)
{}
/* opt_pass methods: */
virtual bool gate (function *) { return optimize >= 2; }
virtual unsigned int execute (function *);
}; // class pass_complete_unrolli
and for others -O
influences their internal algorithms e.g. in optimization of expressions:
/* If FROM is a SUBREG, put it into a register. Do this
so that we always generate the same set of insns for
better cse'ing; if an intermediate assignment occurred,
we won't be doing the operation directly on the SUBREG. */
if (optimize > 0 && GET_CODE (from) == SUBREG)
from = force_reg (from_mode, from);
There is no way to work around this, you have to use -O
.
Upvotes: 1