tierriminator

Reputation: 649

GCC hidden optimizations

The GCC manual lists the optimization flags enabled at each optimization level (-O1, -O2, etc.). However, when compiling and measuring a benchmark program (e.g. cBench's automotive_bitcount), there is a significant difference between applying an optimization level and turning on all the listed flags manually. For automotive_bitcount, I measured a speedup of roughly 100% when compiling with -O1 instead of manually applying all the flags listed for -O1. These "hidden" optimizations in fact seem to be the main part of the optimization work GCC does at -O1: when applying the flags manually, I only get a speedup of about 10% compared to no optimizations. The same can be observed when applying all flags reported as enabled by gcc -c -Q -O3 --help=optimizers.

In the GCC manual, I found this passage, which would explain the behavior:

Not all optimizations are controlled directly by a flag. Only optimizations that have a flag are listed in this section. Most optimizations are completely disabled at -O0 or if an -O level is not set on the command line, even if individual optimization flags are specified.

Since I couldn't find any further documentation on these optimizations, I wonder whether there is a way of controlling them, and what they are in detail.

Upvotes: 1

Views: 170

Answers (1)

yugr

Reputation: 21878

Some optimizations are gated directly on the -O level, e.g. the complete unroller:

class pass_complete_unrolli : public gimple_opt_pass
{
public:
  pass_complete_unrolli (gcc::context *ctxt)
    : gimple_opt_pass (pass_data_complete_unrolli, ctxt)
  {}

  /* opt_pass methods: */
  virtual bool gate (function *) { return optimize >= 2; }
  virtual unsigned int execute (function *);

}; // class pass_complete_unrolli

For others, the -O level influences their internal algorithms, e.g. in the handling of expressions:

         /* If FROM is a SUBREG, put it into a register.  Do this
            so that we always generate the same set of insns for
            better cse'ing; if an intermediate assignment occurred,
            we won't be doing the operation directly on the SUBREG.  */
         if (optimize > 0 && GET_CODE (from) == SUBREG)
           from = force_reg (from_mode, from);

There is no way to work around this; you have to use -O.
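To make the two patterns concrete, here is a toy sketch of how such gating might look in isolation. It is not GCC code: `Options`, `Pass`, `gate_flagged`, `gate_hidden`, `passes_that_run`, `forces_register` and `self_check` are made-up names, and the assumption that a flag-controlled pass also checks the -O level is only modeled on the manual passage quoted in the question.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the compiler's global option state.
// The field names echo the snippets above, but nothing here is GCC code.
struct Options {
  int optimize = 0;          // the -O level (0 when no -O is given)
  bool flag_unroll = false;  // stands in for one individual -f flag
};

// A toy pass: it runs only when its gate returns true.
struct Pass {
  std::string name;
  bool (*gate)(const Options &);
};

// Pattern 1a (assumed): a pass that has a flag, but whose gate also
// checks the -O level. With optimize == 0 the flag alone is not enough.
inline bool gate_flagged(const Options &o) {
  return o.optimize >= 1 && o.flag_unroll;
}

// Pattern 1b: a pass with no flag at all, gated purely on the -O level,
// like the `optimize >= 2` gate of the complete unroller above.
inline bool gate_hidden(const Options &o) { return o.optimize >= 2; }

// Pattern 2: code that always runs but changes behavior when optimizing,
// like the `optimize > 0 && GET_CODE (from) == SUBREG` check above.
inline bool forces_register(const Options &o, bool is_subreg) {
  return o.optimize > 0 && is_subreg;
}

// Returns the names of the toy passes that would run for these options.
inline std::vector<std::string> passes_that_run(const Options &o) {
  static const Pass pipeline[] = {
      {"flagged_pass", gate_flagged},
      {"hidden_pass", gate_hidden},
  };
  std::vector<std::string> ran;
  for (const Pass &p : pipeline)
    if (p.gate(o))
      ran.push_back(p.name);
  return ran;
}

// Small self-check covering the three interesting option combinations.
inline void self_check() {
  Options o;
  o.flag_unroll = true;                    // flag set, but no -O level
  assert(passes_that_run(o).empty());      // nothing runs
  assert(!forces_register(o, true));       // internal behavior unchanged

  o.optimize = 1;                          // now the flag takes effect
  assert(passes_that_run(o).size() == 1);
  assert(forces_register(o, true));

  o.optimize = 2;                          // the flagless pass runs too
  assert(passes_that_run(o).size() == 2);
}
```

With `optimize` left at 0, `passes_that_run` returns an empty list even though `flag_unroll` is set, which mirrors what the question observed when passing the individual flags without an -O level.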

Upvotes: 1
