Reputation: 171
Run the following code:
Case 1:
#include <stdio.h>
int count=0;
void g(void){
    printf("Called g, count=%d.\n",count);
}
#define EXEC_BUMP(func) (func(),++count)
typedef void(*exec_func)(void);
/* static: a plain C99 inline definition can fail to link if the call is not inlined */
static inline void exec_bump(exec_func f){
    f();
    ++count;
}
int main(void)
{
    //int count=0;
    while(count++<10){
        EXEC_BUMP(g);
        //exec_bump(g);
    }
    return 0;
}
Case 2:
#include <stdio.h>
int count=0;
void g(void){
    printf("Called g, count=%d.\n",count);
}
#define EXEC_BUMP(func) (func(),++count)
typedef void(*exec_func)(void);
/* static: a plain C99 inline definition can fail to link if the call is not inlined */
static inline void exec_bump(exec_func f){
    f();
    ++count;
}
int main(void)
{
    //int count=0;
    while(count++<10){
        //EXEC_BUMP(g);
        exec_bump(g);
    }
    return 0;
}
Case 3:
#include <stdio.h>
int count=0;
void g(void){
    printf("Called g, count=%d.\n",count);
}
#define EXEC_BUMP(func) (func(),++count)
typedef void(*exec_func)(void);
/* static: a plain C99 inline definition can fail to link if the call is not inlined */
static inline void exec_bump(exec_func f){
    f();
    ++count;
}
int main(void)
{
    int count=0;
    while(count++<10){
        //EXEC_BUMP(g);
        exec_bump(g);
    }
    return 0;
}
Case 4:
#include <stdio.h>
int count=0;
void g(void){
    printf("Called g, count=%d.\n",count);
}
#define EXEC_BUMP(func) (func(),++count)
typedef void(*exec_func)(void);
/* static: a plain C99 inline definition can fail to link if the call is not inlined */
static inline void exec_bump(exec_func f){
    f();
    ++count;
}
int main(void)
{
    int count=0;
    while(count++<10){
        EXEC_BUMP(g);
        //exec_bump(g);
    }
    return 0;
}
The cases differ in whether main defines a local variable named count, and in whether the loop uses the inline function or the macro. Why does the code above give different output? Also, can anyone explain why using an inline function is more efficient than using a macro?
Output below:
Case 1:
Called g, count=1.
Called g, count=3.
Called g, count=5.
Called g, count=7.
Called g, count=9.
Case 2:
Called g, count=1.
Called g, count=3.
Called g, count=5.
Called g, count=7.
Called g, count=9.
Case 3:
Called g, count=0.
Called g, count=1.
Called g, count=2.
Called g, count=3.
Called g, count=4.
Called g, count=5.
Called g, count=6.
Called g, count=7.
Called g, count=8.
Called g, count=9.
Case 4:
Called g, count=0.
Called g, count=0.
Called g, count=0.
Called g, count=0.
Called g, count=0.
Upvotes: 2
Views: 1641
Reputation:
I think your test is kind of comparing apples and oranges, especially in cases 3 and 4. Your macro is incrementing the local count variable, and your inline function is incrementing the global count variable.
You named them both count, but the macro increments the one within the local scope. It's something you have to watch out for when using macros, since they have no concept of scope.
I would suggest naming that global variable something different from the local counter to avoid the confusion.
Macros are kind of like a brute-force 'copy and paste' code-generating mechanism with a bit of text substitution. So when you define a macro like:
#define FOO (++count)
... and invoke that, it's like literally writing ++count into the function from which you're invoking it. Macro expansion always forcefully inlines code in this sense, and it does so before the compiler and linker even get to it. The preprocessor is a completely separate stage in the build process, and so macros don't respect the same language rules for scoping that an inline function would. We also tend to have to be a bit more careful with them for this reason, since running into a case like your test can be really confusing to debug.
Function inlining, on the other hand, is actually a compile-time (and sometimes even link-time) optimization. The resulting object code or even final binary is made devoid of the normal function calling overhead, of pushing things to the stack, potentially having to pass specific things through exact registers (depending on the calling convention), etc. So you can get a similar kind of performance benefit as macro expansion, but inline functions respect the language's scoping rules.
It's also worth noting that inline is more of a hint than anything else, and some compilers largely ignore it as an optimization request, so that its only remaining effect is on linkage. You can actually get something faster in a real-world scenario if you or the optimizer choose not to inline something, as that might help reduce instruction cache misses, for example, or help the optimizer allocate registers for a more common-case branch of execution. Excessive inlining can reduce your code to a flat playing field when you would rather have the optimization skewed towards a common-case branch, and that can give worse results.
Macros do not provide such a luxury for the compiler or linker to selectively inline things, but they're a very different kind of code-generating tool. Macros can be used to generate new functions, for example, while inline functions cannot.
Upvotes: 3
Reputation: 16017
A good start when you want to think about inline functions is, perhaps surprisingly, to ignore the inline.
The semantics (the meaning) of a function do not change one bit when it is declared inline. It is just a hint to the compiler to optimize an implementation detail, and the compiler is not forced to honor it. On the other hand, the compiler is free to inline functions which are not declared inline, and most modern compilers will do so when invoked with optimization options.
The meaning of preprocessor macros, on the other hand, is simple and stupid. It is pure text replacement, happening before the compiler proper sees the code. All identifiers occurring in the resulting C program text mean whatever they mean at that place in the code.
Upvotes: 2