Reputation: 123
I have a question about inline functions in C++. I know that similar questions have appeared many times here; I hope that mine is a little bit different.
I know that when you declare a function inline, it is just a "suggestion" to the compiler. So in a case like:
inline int func1()
{
return 2;
}
Some code later
cout << func1() << endl; // replaced by cout << 2 << endl;
So there is no mystery there, but what about cases like this:
inline int func1()
{
return 2;
}
inline int func2()
{
return func1() * 2;
}
inline int func3()
{
return func2() * func1() * 2;
}
And so on...
Which of these functions has a chance of being inlined, is it beneficial, and how can I check what the compiler actually did?
Upvotes: 4
Views: 424
Reputation: 1
Think of inline as only a hint to the compiler, a bit like register was in old versions of the C++ and C standards. Caveat: register was deprecated and has been removed in C++17.
Which of these functions has a chance of being inlined, is it beneficial
Trust your compiler to make sane inlining decisions. To inline a particular call, the compiler needs to know the body of the called function. In theory, you should not care whether the compiler inlines a given call or not.
In practice, with the GCC compiler:
- inlining does not always improve performance (e.g. because of CPU cache effects, the TLB, the branch predictor, etc.).
- inlining decisions depend a lot on optimization options. Inlining is more likely to happen with -O3 than with -O1; there are many guru options (like -finline-limit= and others) to tune it.
- notice that inlining is decided per call site. It is quite possible that some call occurrence like foo(x) at line 123 is inlined, while another call occurrence to the same function foo, like foo(y) at line 456, is not.
- when debugging, you may want to disable inlining (because that makes debugging more convenient). This is possible with the -fno-inline GCC optimization flag (which I often use together with -g, which asks for debugging information).
- the always_inline function attribute "forces" inlining, and the noinline attribute prevents it (see the sketch after this list).
- if you compile and link with link-time optimization (LTO) such as -flto -O2 (or -flto -O3), e.g. with CXX=g++ -flto -O2 in your Makefile, inlining can happen across translation units (e.g. C++ source files). However, LTO at least doubles the compilation time (and often worse), consumes a lot of memory during compilation (so better have plenty of RAM), and often improves performance by only a few percent (with weird exceptions to this rule of thumb).
- you might optimize a particular function differently, e.g. with #pragma GCC optimize ("-O3") or with the optimize function attribute.
- look also into profile-guided optimization: instrument with -fprofile-generate, then rebuild with -fprofile-use and your other optimization flags.
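A minimal sketch of the attribute-related points above (always_inline, noinline, and the optimize attribute), assuming g++ or clang++; the file and function names are made up for illustration:

// attrs.cpp -- compile with e.g. g++ -O2 -c attrs.cpp
// "Forces" inlining of calls to this function (GCC diagnoses an error if it cannot):
__attribute__((always_inline)) inline int twice(int x)
{
    return x * 2;
}

// Prevents inlining, e.g. to keep the function visible to a profiler or debugger:
__attribute__((noinline)) int slow_path(int x)
{
    return x + 1;
}

// Per-function optimization level via the optimize attribute
// (the exact strings accepted may vary between GCC versions):
__attribute__((optimize("O3"))) int hot_sum(const int *data, int n)
{
    int sum = 0;
    for (int i = 0; i < n; ++i)
        sum += twice(data[i]) + slow_path(data[i]);
    return sum;
}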
If you are curious about which calls get inlined (and sometimes some won't be), look into the generated assembler (e.g. use g++ -O2 -S -fverbose-asm and look at the .s assembler file), or use some internal dump options.
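For instance (a hypothetical file inline_check.cpp, assuming g++ on a typical x86-64 setup), you can compile to assembler and search for leftover call instructions to the mangled function names:

// inline_check.cpp -- compile with: g++ -O2 -S -fverbose-asm inline_check.cpp
// Then inspect inline_check.s: a remaining "call _Z5func3v" (etc.) means the
// call was not inlined; if only a constant remains, it was inlined and folded.
inline int func1() { return 2; }
inline int func2() { return func1() * 2; }
inline int func3() { return func2() * func1() * 2; }

int use_them(int x)
{
    return x + func3();   // at -O2 this is typically folded to x + 16
}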
The observable behavior of your code (except performance) should not depend on the inlining decisions made by your compiler. In other words, don't expect inlining to happen (or not to happen). If your code behaves differently with or without some optimization, it is likely to be buggy, so read about undefined behavior.
See also the MILEPOST GCC project (which uses machine learning techniques for optimization purposes).
Upvotes: 1
Reputation: 171127
Which of these functions have a chance to become inlined
Any and all functions have a chance to become inlined, if the tool(1) doing the inlining has access to the function's definition (= body) ...
is it beneficial
... and deems it beneficial to do so. Nowadays, it's the job of the optimiser to determine where inlining makes sense, and for 99.9% of programs, the best the programmer can do is stay out of the optimiser's way. The remaining few cases are programs like Facebook, where a 0.3% performance loss is a huge regression. In such cases, manual tweaking of optimisations (along with profiling, profiling, and profiling) is the way to go.
how to check what compiler actually did
By inspecting the generated assembly. Every compiler has a flag to make it output assembly in "human-readable" format instead of (or in addition to) object files in binary form.
(1) Normally, this tool is the compiler and inlining happens as part of the compilation step (turning source code into assembly/object files). That is also the only reason why you may be required to use the inline keyword to actually allow a compiler to inline: because the function's definition must be visible in the translation unit (= source file) being compiled, and quite often that means putting the function definition into a header file. Without inline, this would then lead to multiple-definition errors if the header file were included in more than one translation unit.
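As a small illustration of that point (file names here are hypothetical):

// twice.h -- a header included from more than one .cpp file
#ifndef TWICE_H
#define TWICE_H

// Without "inline", a.cpp and b.cpp below would each emit their own definition
// of twice(), and linking them together would fail with a multiple-definition
// error; with "inline", the duplicate definitions are permitted and merged.
inline int twice(int x)
{
    return x * 2;
}

#endif

// a.cpp:  #include "twice.h"  ...  int a() { return twice(1); }
// b.cpp:  #include "twice.h"  ...  int b() { return twice(2); }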
Note that compilation is not the only stage when inlining is possible. When you enable Whole-Program Optimisation (also known as Link-Time Code Generation), one more pass of optimisation happens at link time, once all object files are created. At this point, the inline keyword is totally irrelevant, since linking has access to all the function definitions (the binary wouldn't link successfully otherwise). This is therefore the way to get the most benefit from inlining without having to think about it at all when writing code. The downside is time: WPO takes time to run, and for large projects, can prolong link times to unacceptable levels (I've personally experienced a somewhat pathological case where enabling WPO took a program's link time from 7 minutes to 46).
Upvotes: 4