user768417

Reputation:

How much should I optimize?

With regard to optimizations performed by the compiler (GCC), what is the standard practice? What does each option (-O, -O1, -O2, -O3, -Os, -s, -fexpensive-optimizations) do differently, and how do I decide which is optimal?

Upvotes: 12

Views: 3183

Answers (2)

roim

Reputation: 4930

Usually -O2 is a good optimization level to try first.

However, if you want the best possible result, you will end up trying several levels, since you can't tell beforehand which one will suit your application best (a small sketch of how you might script such a comparison follows the list of levels below).

Also note that optimization results vary between CPUs (on some CPUs, optimizing for size can actually yield better speed than optimizing for speed).

Just for future reference, here's a brief description of each level (you can find the complete description in the documentation http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html):

-O (identical to -O1): With -O, the compiler tries to reduce code size and execution time, without performing any optimizations that take a great deal of compilation time.

-O2: Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. As compared to -O, this option increases both compilation time and the performance of the generated code.

-O3: Optimize yet more. -O3 turns on all optimizations specified by -O2 and also turns on the -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload, -ftree-vectorize, -ftree-partial-pre and -fipa-cp-clone options.

-Os: Optimize for size. -Os enables all -O2 optimizations that do not typically increase code size. It also performs further optimizations designed to reduce code size.

-Ofast: Disregard strict standards compliance. -Ofast enables all -O3 optimizations. It also enables optimizations that are not valid for all standard compliant programs. It turns on -ffast-math and the Fortran-specific -fno-protect-parens and -fstack-arrays. If you use multiple -O options, with or without level numbers, the last such option is the one that is effective.
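If you want to see the effect of each level on your own code, here is a minimal sketch of how you might script the comparison. It assumes gcc is on your PATH and uses main.c as a stand-in for your actual source file:

    #!/usr/bin/env python3
    # Compile one source file at several optimization levels and report
    # the size of each resulting binary, as a rough first comparison.
    import os
    import subprocess

    SOURCE = "main.c"   # stand-in for your actual source file
    LEVELS = ["-O0", "-O1", "-O2", "-O3", "-Os", "-Ofast"]

    for level in LEVELS:
        out = "app" + level          # e.g. app-O2
        subprocess.run(["gcc", level, SOURCE, "-o", out], check=True)
        print(f"{level:7} {os.path.getsize(out)} bytes")

Binary size alone won't tell you which level runs fastest, but it is a cheap way to spot when -O3 starts to bloat the code.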

Upvotes: 8

sarnold

Reputation: 104080

The Linux kernel's Makefile provides for both -O2 and -Os. Either one would be appropriate in the absence of further details.

-Os optimizes for small code size. Since CPUs are significantly faster than main memory these days, optimizing for size makes sense even on huge machines -- any time spent waiting for the cache to be filled from main memory is wasted. So make the best possible use of the instruction cache by compiling for space efficiency, and execution time may well improve too.

-O2 runs all the "usual optimizations", and the optimizations it chooses are safe. (I've heard that some of the -O3 optimizations aren't always safe, but that may be because the Linux kernel runs with constraints not common to ordinary applications.)

The best answer, of course, is to compile your software at multiple optimization levels: time how long each build takes, time how long the resulting binaries take to run through a representative benchmark, and measure how much memory each one uses.

Then pick the "best" combination of compilation speed, run time speed, and run time memory use. You might want fastest compiles or you might want fastest run times, or you might be trying to fit into a smaller amount of memory from a virtual hosting provider to save money.
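As a rough sketch of that measurement loop (Unix-only, Python 3.9+, with main.c standing in for your real source file and the compiled binary run as-is standing in for your real benchmark):

    #!/usr/bin/env python3
    # For each optimization level: time the compile, time a benchmark run,
    # and record the benchmark's peak memory use via the child's rusage.
    import os
    import time

    SOURCE = "main.c"            # stand-in for your real source file
    LEVELS = ["-O2", "-O3", "-Os"]

    def run(argv):
        """Fork/exec argv; return (wall-clock seconds, max RSS -- KiB on Linux)."""
        start = time.perf_counter()
        pid = os.fork()
        if pid == 0:                          # child: become the command
            os.execvp(argv[0], argv)
        _, status, ru = os.wait4(pid, 0)      # parent: reap it, keep its rusage
        if os.waitstatus_to_exitcode(status) != 0:
            raise RuntimeError(f"{argv} failed")
        return time.perf_counter() - start, ru.ru_maxrss

    for level in LEVELS:
        binary = "./bench" + level            # e.g. ./bench-O2
        compile_s, _ = run(["gcc", level, SOURCE, "-o", binary])
        run_s, rss = run([binary])
        print(f"{level}: compile {compile_s:.2f}s, run {run_s:.2f}s, peak {rss} KiB")

Run the benchmark several times per level and look at the spread; a single run can easily be dominated by noise.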

It's probably fair to just pick -O2 without doing any measurements.

Upvotes: 2
