Reputation: 3476
Although I know that every program is a different scenario, I have a rather specific question about the table below.
g++
Ox     WHAT IS BEING OPTIMIZED                          | EXEC | CODE | MEM | COMP
                                                        | TIME | SIZE |     | TIME
--------------------------------------------------------------------------------
O0     optimize for compilation time                    |  +   |  +   |  -  |  -
O1     optimize for code size and execution time #1     |  -   |  -   |  +  |  +
O2     optimize for code size and execution time #2     |  --  |  0   |  +  |  ++
O3     optimize for code size and execution time #3     |  --- |  0   |  +  |  +++
Ofast  O3 plus fast, non-conforming math (-ffast-math)  |  --- |  0   |  +  |  +++
Os     optimize for code size                           |  0   |  --  |  0  |  ++

Legend:  + increase    ++ increase more    +++ increase even more
         - reduce      -- reduce more      --- reduce even more
         0 roughly unchanged
I am using version 8.2, though this should be a generic table; it was taken from here and rewritten as plain text.
My question is whether it can be trusted. I don't know that website, so I'd rather ask the professionals here. Is the table more or less accurate?
Upvotes: 0
Views: 1822
Reputation: 1
Your table is broadly accurate.
Notice that GCC has zillions of optimization options. Some weird optimization passes are not even enabled at -O3 (GCC has several hundred optimization passes).
But there is no guarantee that -O3 always gives code which runs faster than the same code compiled with -O2. This is generally the case, but not always. You could find pathological (or just weird) C source code which, when compiled with -O3, gives slightly slower binary code than the same C source code compiled with -O2. For example, -O3 is likely to unroll loops "better" (at least "more") than -O2, but some code might perform worse if some particular loop in it is unrolled more. The phoronix website and others benchmark GCC and observe such phenomena.
Be aware that optimization is an art; it is in general an intractable or undecidable problem, and current processors are so complex that there is no exact and complete model of their performance (think of caches, branch predictors, pipelines, out-of-order execution). Besides, the detailed micro-architecture of x86 processors is obviously not public (you cannot get the VHDL or chip layout of Intel or AMD chips). Hence, the -march= option to GCC also matters (the same binary code is not always good on both AMD and Intel chips, or even on several brands of Intel processors). So, if you compile code on the same machine that runs it, passing -march=native in addition to -O2 or -O3 is recommended.
People paid by Intel and by AMD are actively contributing to GCC, but they are not allowed to share all the knowledge they have internally about Intel or AMD chips. They are allowed to share (with the GPLv3+ license of GCC) the source code they are contributing to the GCC compiler. Probably engineers from AMD are observing the Intel-contributed GCC code to guess micro-architectural details of Intel chips, and vice versa.
And Intel's and AMD's interests obviously include making GCC work well with their proprietary chips. Those corporate interests justify paying (both at Intel and at AMD) several highly qualified compiler engineers to contribute full time to GCC.
In practice, I have observed that both AMD and Intel engineers "play the game" of open source: they routinely contribute GCC code which also improves their competitor's performance. This is more a social, ethical, and economic issue than a technical one.
PS. You can find many papers and books on the economics of open source.
Upvotes: 2