Reputation:
I wrote the well-known swap function in C and looked at the assembly output using gcc -S, then did the same again but with -O2 optimization.
The difference was pretty big: I saw only about 5 lines of assembly compared to 20.
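For reference, this is roughly what I mean (the usual textbook swap; the exact code I used may differ slightly):

/* swap.c -- compare the assembly output of:
 *   gcc -S swap.c        (no optimization)
 *   gcc -S -O2 swap.c    (with -O2)
 */
void swap(int *a, int *b)
{
    int tmp = *a;   /* save *a before it is overwritten */
    *a = *b;
    *b = tmp;
}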
My question is: if optimisation really helps, what is the reason for not using it all the time? Why do we compile code without optimization at all?
An extra question to those working in the industry: when you release the final version of your program after testing it, do you compile with optimizations on?
I am responding to all your comments, please read them.
Upvotes: 1
Views: 1720
Reputation: 21325
"If optimisation really helps what's the reason of not using it all the time?" OP question seems misplaced. What is the goal of optimization?
In support of the other answers, note that the GCC docs stipulate:
Turning on optimization flags makes the compiler attempt to improve the performance and/or code size at the expense of compilation time and possibly the ability to debug the program.
In addition to the many good points raised in other answers about why optimization may not be used "all the time", there is the question of which optimizations should be turned on. OP's question seems to really be about what should be the default level of optimization for, e.g., GCC.
Yes, execution speed is sometimes a priority in release code, but not always. Sometimes the size of a binary takes priority. There are many fine-grained optimization options available. Which of these should be enabled by default? GCC has in fact made that choice: -O0 is the default optimization level, appropriate for edit/compile/debug cycles where fast compilation and fully consistent debugging are both useful.
A quick look at GCC 12.2 using --help=optimizers to view the enabled optimization flags shows that there are 266 optimization flags available. (Some flags are vastly more important and potentially impactful than others for both compile time and performance, like -ftree-vectorize for automatic use of SIMD, but counting the number of separate flags is easy to do.) Note that many of these -f options only take effect when an optimization level other than -O0 has been selected. The -O optimization levels enable some optimizations that don't have corresponding flags or documentation, but of the ones that do have flags:
-O0 (the default level) enables 49 optimization flags. Many other -f flags have no effect at this level. There are qualitative differences in code-gen between -O0 and any other level.
-O1 enables 90 optimization flags (optimize a bit while compiling fast, a bit more than -Og).
-Os enables 127 optimization flags (optimize for size and speed, vs. -Oz which is aggressive for size).
-O2 enables 134 optimization flags (optimize for speed).
-O3 enables 147 optimization flags (optimize aggressively for speed).
-Ofast enables 149 optimization flags.
Note that the counts above are not strictly accurate since some of the enabled flags themselves enable a number of finer-grained flags.
If the goal is speed, perhaps -Ofast should be the default. Yet this flag enables -ffast-math, which can cause subtle problems for numerically sensitive code and for multi-threaded code; this is probably not a good default. Sometimes optimizations involve trade-offs which may be significant. GCC has already made a choice for what it considers to be a useful general default optimization level for development cycles. Other optimization levels may be appropriate during development, but these may also involve further specifics which may not always be the best choices. OP's "what's the reason of not using it all the time" simply paints the question with too broad a brush, and even OP's comment "but I'm talking about speed optimizations" is too broad, as there are many detailed considerations that may be relevant here.
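To make the -ffast-math point concrete, here is a small illustration of my own (not from the GCC docs): -Ofast implies -ffast-math, which among other things assumes NaNs never occur, so a hand-written NaN check may quietly stop working.

#include <stdio.h>
#include <stdlib.h>

/* Under strict IEEE semantics, x != x is true only when x is NaN.
 * With -ffast-math (implied by -Ofast) the compiler may assume no NaNs
 * exist and fold the comparison to false.  Compare, for example:
 *   gcc -O2    nan_check.c && ./a.out   -> typically prints "not a number"
 *   gcc -Ofast nan_check.c && ./a.out   -> may print "a number"
 */
static int is_nan(double x)
{
    return x != x;
}

int main(void)
{
    double d = strtod("nan", NULL);   /* produce a NaN at run time */
    puts(is_nan(d) ? "not a number" : "a number");
    return 0;
}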
Upvotes: 1
Reputation: 93476
If you never use a source level debugger you probably could. But if you never use a source level debugger, you probably should.
Unoptimized code has a direct one-to-one correspondence to the statements, expressions and variables in the source code, so when stepping through the code it all makes sense: all the lines are executed in the order you would expect, and all variables have a valid state when you would expect them to.
Optimised code, on the other hand, can eliminate code and variables, reorder execution, and generally render source-level debugging a nonsense. Sometimes you get a bug that only appears in an optimised build, so you may have to deal with it, but such things are generally a result of undefined behaviour, and it is better to avoid that in the first instance.
One thing to consider is that in development you have performed all your testing and debugging on unoptimized code, so that you could debug it. If, on the day you release, you crank up the optimiser and ship it, you are essentially shipping a whole lot of untested code. Testing is hard, and you really should test what you release, so between building and releasing you may have a lot of work to do to eliminate the risk. Releasing to the same build spec that you have been testing every day throughout development may be lower risk.
For code running on a desktop, responding to and waiting for user input, or which is disk or network I/O bound, making the code faster or smaller often serves little purpose. There may be specific parts of a large application that will benefit, such as sorting or searching algorithms on large data sets, or image or audio processing, and for those you might use targeted rather than whole-application optimisation.
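As a sketch of what targeted optimisation can look like with GCC (the names here are purely illustrative, and other compilers would typically use per-file flags in the build system instead):

/* hot_path.c -- illustrative sketch only.
 * GCC allows raising the optimisation level for a single function, so an
 * expensive routine can be optimised while the rest of the translation unit
 * stays at the project's normal (possibly debug-friendly) level.           */
__attribute__((optimize("O3")))
void sort_large_dataset(int *data, long n)
{
    /* ...expensive sorting/searching work worth optimising... */
    (void)data;
    (void)n;
}

void handle_user_input(void)
{
    /* I/O-bound code; compiled at whatever level the rest of the build uses. */
}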
In embedded systems, where you are often using processors much slower than desktop systems and with much smaller memory resources, optimisation for both speed and size may be critical; but even there the code normally has to both fit and meet real-time deadlines in its debug build in order to support test and debugging. If it only works optimised, it will be much harder to debug.
Apart from optimising your code, it should perhaps be noted that in order to do that job, the optimiser has to perform a much deeper analysis of the code through techniques such as abstract execution, and in doing so can find bugs and issue warnings that normal compilation will not detect. For example, the optimiser is rather good at detecting variables that may be used before they are initialised. To that end, I would recommend switching on and maxing out the optimiser as a kind of "poor man's" static analysis, even if you use a lower optimisation level for release, for the reasons given earlier.
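Concretely (my sketch, not the answer's code): GCC's -Wmaybe-uninitialized depends on data-flow analysis that is only performed when optimising, so a warning like the one below typically appears at -O1 and above but not at -O0.

/* maybe_uninit.c
 *   gcc -Wall -O2 -c maybe_uninit.c   -> typically warns that 'result' may be
 *                                        used uninitialized
 *   gcc -Wall -O0 -c maybe_uninit.c   -> usually no warning                  */
int lookup(int key)
{
    int result;            /* only assigned on some paths */
    if (key > 0)
        result = key * 2;
    return result;         /* uninitialized when key <= 0 */
}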
The optimiser is also the most complex part of any compiler; if the compiler is going to have a bug, it is likely to be in the optimiser. That said I have only ever encountered one such confirmed bug, in Microsoft C v6.0 in 1989! More often what at first appears to be a compiler bug turns out to be undefined behaviour or latent bugs in the source being compiled that manifest themselves with different code generation options.
Upvotes: 4
Reputation: 81159
The C Standard allows compilers to, as a form of "conforming language extension", specify their behavior in more corner cases than mandated by the Standard--typically by processing them "in a documented manner characteristic of the environment" when doing so would be useful and there would be no compelling reason to do otherwise. In many cases, such behavior makes it possible to accomplish tasks more efficiently than would otherwise be possible.
Some compilers like gcc and clang, however, design optimizations around the assumption that programs will never receive inputs where the aforementioned conforming language extensions would be relevant, and will go out of their way to "optimize out" any constructs which would only be relevant in such scenarios. Disabling such optimizations will make such compilers compatible with programs that rely upon the "certain popular extensions" the authors of the Standard alluded to in the published Rationale.
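A classic illustration of the kind of construct at issue (my example, not the answer's): on two's-complement hardware many programs historically treated signed overflow as wrapping, and gcc and clang will honour that if asked (for example with -fwrapv), but by default their optimizers assume the overflow cannot happen and may remove the test.

#include <limits.h>
#include <stdio.h>

/* Intended as an "is x == INT_MAX?" overflow check on wrapping hardware.
 * Because signed overflow is undefined in standard C, gcc/clang at -O2 may
 * fold x + 1 < x to false and delete the check; compiling with -fwrapv
 * (or without optimization) typically preserves the wrapping behaviour.   */
static int will_overflow(int x)
{
    return x + 1 < x;
}

int main(void)
{
    printf("%d\n", will_overflow(INT_MAX));   /* often 1 with -fwrapv, 0 at -O2 */
    return 0;
}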
Upvotes: 0
Reputation: 31379
There are a few reasons.
For small and even medium-sized projects, this is rarely an issue today. Modern computers are VERY fast. If a build takes five or ten seconds, it usually does not matter. But for larger projects it does matter, especially if the build process is not set up properly. I remember when I was trying to add a feature to the game The Battle for Wesnoth: compilation took around ten minutes. It's easy to see how much you would want to reduce that to five minutes or less if you could.
The reason it makes code harder to debug is that the debugger does not really run the program line by line; that's just an illusion. Here is an example where it might be a problem:
#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    char str[] = "Hello, World!";
    int number_of_capital_letters = 0;
    for (int i = 0; i < strlen(str); i++) {
        if (isupper(str[i]))
            number_of_capital_letters++;
    }
    printf("%s\n", str);
    // Commented out for debugging reasons
    // printf("%d\n", number_of_capital_letters);
}
You fire up your debugger and wonder why it does not keep track of number_of_capital_letters. Then you find out that since you have commented out the last printf statement, the variable is not used for any observable behavior, so the optimizer changes your code to:
int main(void) {
    puts("Hello, World!");
}
One could argue that you could then just turn off the optimizer for a debug build. And that's true in a world where a cow is a sphere. But there is a third reason:
Imagine that you have a big code base. When you upgrade the compiler, a bug suddenly emerges, and it seems to vanish when you remove optimization. What's the problem here? Well, it could be a bug in the optimizer. But it could also be a bug in your code that only manifested itself with the new version of the optimizer. Very often, code with undefined behavior behaves differently when compiled with optimization.
So what do you do? You could try to figure out whether the bug is in the optimizer or in your code. That can be a VERY time-consuming task. Let's assume it's a bug in the optimizer. What to do? You could downgrade your compiler, which is not optimal for several reasons, especially if it's an open source project. Imagine downloading the source, running the build script, and scratching your head for hours trying to figure out what's wrong, only to find in some documentation (provided that the author documented it) that you need a specific version of a specific compiler.
Let's instead assume it's a bug in your code. The ideal thing is of course to fix it. But maybe you don't have the resources to do so. In this case too, you could require anyone who compiles it to use a certain version of a specific compiler.
But if you could just edit a Makefile and replace -O3 with -O2, you can clearly see that it's sometimes a viable option in our non-ideal world where time is not an endless resource. With a bit of bad luck, such a bug can take a week to track down. Or more. That's time you can spend somewhere else.
Here is an example of such a bug:
#include <stdio.h>

int main(void) {
    char str[] = "Hello";
    str[5] = '!';
    puts(str);
}
When I compiled this with gcc 10.2 I got different results depending on optimization level.
Without optimization:
Hello!
With optimization:
Hello!`@
Try it out yourself:
https://godbolt.org/z/5dcKKrEW1
https://godbolt.org/z/48bz5ae1d
And here I found a forum thread where the debug build works but not release: https://developer.apple.com/forums/thread/15112
Yep, that may also happen. In this case, you could just lower the optimization if you don't care that much about correctness. But if you do care, this can be a way to find bugs: if your code runs correctly both with and without optimization, it's less likely to contain bugs that will haunt you in the future than if you have only compiled with optimization.
I did not find an example that worked, but something like this might do, in theory:
#include <stdio.h>

int main(void) {
    if (1 / 0) // Division by zero
        puts("An error has occurred");
    else
        puts("Everything is fine");
}
If this is compiled without optimization, there is a high probability that it will crash. But the optimizer might assume that undefined behavior (like division by zero) never occurs, so it optimizes the code to just:
int main(void) {
    puts("Everything is fine");
}
Assume that 1/0 is some kind of error check that is very unlikely to evaluate to true, so you would normally assume the program prints "Everything is fine". Here, the optimizer hides a bug.
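A better-known pattern of the same kind, which optimizing compilers have been known to exploit (again my example, not the answer's), is a null-pointer check that is only reached after the pointer has already been dereferenced:

#include <stddef.h>
#include <stdio.h>

struct packet { int length; };

/* The dereference happens before the null check.  Since dereferencing a null
 * pointer is undefined behaviour, an optimizing compiler may conclude that
 * p cannot be NULL here and remove the check, so the error path silently
 * disappears from the optimized build.                                       */
int packet_length(const struct packet *p)
{
    int len = p->length;        /* undefined behaviour if p == NULL */
    if (p == NULL) {            /* may be optimized away */
        fputs("An error has occurred\n", stderr);
        return -1;
    }
    return len;
}

int main(void)
{
    struct packet pkt = { 42 };
    printf("%d\n", packet_length(&pkt));
    return 0;
}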
This sometimes matters. Especially in embedded systems. Usually (always) -O0 produces very big code, but you might want to use -Os (optimize for size instead of speed) instead of -O3 to get a small binary. And sometimes also to get faster code. See below.
Yep, really. It's not often, but it may happen. A related but not equivalent example is illustrated in this question, where the compiler generates faster code when optimizing for the size of the executable than when optimizing for speed.
Upvotes: 8
Reputation: 47952
One reason is probably just: tradition. The first C compiler was written for the DEC PDP-11, which had a 64k address space. (That's right, a tenth of that famous but mythical old IBM PC quote about "640k should be enough for anybody".) The first C compiler ran as quite a number of separate programs or passes: there was the preprocessor cpp, the parser c0, the code generator c1, the assembler as, and the linker ld. If you asked for optimization, it ran as a separate pass c2, which was a "peephole optimizer" operating on c1's output before passing it to as.
Compilation was much slower in those days than it is today (because of course the processors were much slower). People didn't routinely request optimization for everyday work, because it really did cost you something significant in your edit/compile/debug cycle.
And although a whole lot has changed since then, the fact that optimization is something extra, something special, that you have to request explicitly, lives on.
Upvotes: 2
Reputation: 4431
Personally I usually have optimisation turned on.
My reasons are:
The shipped code is built with optimisation as we need the performance, especially the numerical performance. Since you can't ship what you haven't tested, the test version must also be optimised. It would, I suppose, be possible to build without optimisation during development, but I begrudge the extra time to then build with optimisation, and test, prior to release to test. Moreover, performance is sometimes part of the spec, so some development testing has to be done with optimised code.
I don't find using a debugger so very tough with optimised code. Mind you, given the kind of programs I mostly write -- fancy filters without user interfaces and numerical libraries -- printf and valgrind (which works fine with optimised code) are my preferred tools.
In recent versions of gcc, at least, more and better diagnostics are produced with optimisation on rather than off.
This, like so much else in programming, will of course vary with circumstances.
Upvotes: 3