Reputation: 9825
I know what the differences between __cdecl and __stdcall are, but I'm not quite sure why __stdcall is ignored by the compiler in x64 builds.
The functions in the following code
int __stdcall stdcallFunc(int a, int b, int c, int d, int e, int f, int g)
{
    return a + b + c + d + e + f + g;
}

int __cdecl cdeclFunc(int a, int b, int c, int d, int e, int f, int g)
{
    return a + b + c + d + e + f + g;
}

int main()
{
    stdcallFunc(1, 2, 3, 4, 5, 6, 7);
    cdeclFunc(1, 2, 3, 4, 5, 6, 7);
    return 0;
}
have enough parameters to exceed the available CPU registers. Therefore, some arguments must be passed via the stack. I'm not fluent in assembly but I noticed some differences between x86 and x64 assembly.
x64
main PROC
$LN3:
sub rsp, 72 ; 00000048H
mov DWORD PTR [rsp+48], 7
mov DWORD PTR [rsp+40], 6
mov DWORD PTR [rsp+32], 5
mov r9d, 4
mov r8d, 3
mov edx, 2
mov ecx, 1
call ?stdcallFunc@@YAHHHHHHHH@Z ; stdcallFunc
mov DWORD PTR [rsp+48], 7
mov DWORD PTR [rsp+40], 6
mov DWORD PTR [rsp+32], 5
mov r9d, 4
mov r8d, 3
mov edx, 2
mov ecx, 1
call ?cdeclFunc@@YAHHHHHHHH@Z ; cdeclFunc
xor eax, eax
add rsp, 72 ; 00000048H
ret 0
main ENDP
x86
_main PROC
push ebp
mov ebp, esp
push 7
push 6
push 5
push 4
push 3
push 2
push 1
call ?stdcallFunc@@YGHHHHHHHH@Z ; stdcallFunc
push 7
push 6
push 5
push 4
push 3
push 2
push 1
call ?cdeclFunc@@YAHHHHHHHH@Z ; cdeclFunc
add esp, 28 ; 0000001cH
xor eax, eax
pop ebp
ret 0
_main ENDP
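Notice that in x64:
- The arguments are not passed with push instructions. Instead we reserve enough stack space at the beginning of main and use mov instructions to add the arguments to the stack.
- No stack cleanup occurs after the calls, but at the end of main.
This brings me to my questions:
- Why does x64 use mov rather than push? I assume it's just more efficient and wasn't available in x86.
- Why is there no stack cleanup after the call instructions in x64?
- What's the reason that Microsoft chose to ignore __stdcall in x64 assembly?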
From the docs:
On ARM and x64 processors, __stdcall is accepted and ignored by the compiler
Here is the example code and assembly.
Upvotes: 4
Views: 5398
Reputation: 24726
- Why does x64 use mov rather than push? I assume it's just more efficient and wasn't available in x86.
That is not the reason. Both of these instructions also exist in x86 assembly language.
The reason why your compiler is not emitting a push instruction for the x64 code is probably because it must adjust the stack pointer directly anyway, in order to create 32 bytes of "shadow space" for the called function. See this link (which was provided by @NateEldredge) for further information on "shadow space".
Allocating 32 bytes of "shadow space" with push instructions would take four 64-bit push instructions, but only one sub instruction. That is why the compiler prefers sub. Since it is using sub anyway to create the 32 bytes of shadow space, there is no penalty for changing the operand from 32 to 72. That allocates 72 bytes on the stack, which is enough to also pass 3 parameters on the stack (the other 4 are passed in CPU registers).
I don't understand why it is allocating 72 bytes on the stack, though. According to my calculations, it only has to be 56 bytes (32 bytes of "shadow space" and 24 bytes for the 3 parameters that are passed on the stack). Possibly the compiler is reserving those extra 16 bytes for local variables or for exception handling, which may be optimized away when compiler optimizations are active.
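The arithmetic above can be sketched in a few lines of C++. This is only an illustration, not what the compiler actually runs: the helper names outgoingArgBytes and keepsStackAligned are made up, and the sketch assumes integer arguments that each take one 8-byte slot. It also assumes the usual x64 Windows rule that rsp is 8 bytes off a 16-byte boundary on function entry, because the call instruction has just pushed the return address.

```cpp
#include <cstddef>

// Microsoft x64 convention for integer arguments (as described above):
// the first four go in rcx, rdx, r8 and r9, the caller always reserves
// 32 bytes of shadow space, and every further argument takes an 8-byte slot.
constexpr std::size_t kRegisterArgs = 4;
constexpr std::size_t kSlotSize = 8;
constexpr std::size_t kShadowSpace = kRegisterArgs * kSlotSize;  // 32 bytes

// Hypothetical helper: minimum number of bytes the caller must reserve
// for a single call with `argCount` integer arguments.
constexpr std::size_t outgoingArgBytes(std::size_t argCount) {
    return kShadowSpace +
           (argCount > kRegisterArgs ? (argCount - kRegisterArgs) * kSlotSize : 0);
}

// Hypothetical helper: true if "sub rsp, frameBytes" at function entry
// leaves rsp 16-byte aligned. Assumes rsp is 8 mod 16 on entry.
constexpr bool keepsStackAligned(std::size_t frameBytes) {
    return (frameBytes + 8) % 16 == 0;
}

static_assert(outgoingArgBytes(7) == 56, "32 bytes shadow space + 3 * 8 bytes");
static_assert(keepsStackAligned(56), "56 bytes would already be aligned");
static_assert(keepsStackAligned(72), "72 bytes is aligned as well");
```

Under these assumptions both 56 and 72 keep the stack 16-byte aligned, so alignment alone does not account for the extra 16 bytes.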
- Why is there no stack cleanup after the call instructions in x64?
There is stack cleanup after the call instructions. This is what the line add rsp, 72 does.
However, for some reason (probably increased performance), the x64 compiler only performs the cleanup at the end of the calling function, instead of after every function call. This means that with the x64 compiler, all function calls share the same stack space for their parameters, whereas with the x86 compiler, the stack space is allocated and cleaned up at every function call.
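One way to picture the shared stack space: the caller reserves a single argument area once, sized for the most demanding call it makes, and every call reuses it. The sketch below models only that sizing rule. The helper names callAreaBytes and sharedAreaBytes are hypothetical, and it assumes integer arguments of 8 bytes each.

```cpp
#include <algorithm>
#include <cstddef>
#include <initializer_list>

// Hypothetical helper: bytes one call needs, i.e. 32 bytes of shadow space
// plus 8 bytes for each argument beyond the four passed in registers.
constexpr std::size_t callAreaBytes(std::size_t argCount) {
    return 32 + (argCount > 4 ? (argCount - 4) * 8 : 0);
}

// Hypothetical helper: size of the single argument area that all calls in
// one function can share. It is the maximum over the calls, not the sum,
// because each call overwrites the slots used by the previous one.
inline std::size_t sharedAreaBytes(std::initializer_list<std::size_t> argCounts) {
    std::size_t largest = 0;
    for (std::size_t n : argCounts) {
        largest = std::max(largest, callAreaBytes(n));
    }
    return largest;
}
```

For the main function in the question, sharedAreaBytes({7, 7}) gives 56: the two calls need the same 56 bytes, which is why the second call simply writes its arguments to [rsp+32], [rsp+40] and [rsp+48] again.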
- What's the reason that Microsoft chose to ignore __stdcall in x64 assembly?
The keywords __stdcall and __cdecl specify 32-bit calling conventions. That's why they are not relevant for 64-bit programs (i.e. x64). On x64, there is only the standard calling convention and the extended __vectorcall calling convention.
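The same effect is visible in the type system and in the decorated names in the question: on x86 the two functions are decorated differently (YG for __stdcall, YA for __cdecl), while on x64 both get YA. A minimal sketch of a compile-time check follows. The macro names DEMO_STDCALL and DEMO_CDECL and the constant kConventionsDiffer are made up for this example, and mapping the keywords to nothing on other compilers is an assumption made only so the snippet builds everywhere.

```cpp
#include <type_traits>

// Assumption: only MSVC-compatible compilers are given the keywords here.
// Elsewhere the macros expand to nothing so the snippet still compiles.
#if defined(_MSC_VER)
#define DEMO_STDCALL __stdcall
#define DEMO_CDECL __cdecl
#else
#define DEMO_STDCALL
#define DEMO_CDECL
#endif

using StdcallFn = int (DEMO_STDCALL *)(int, int);
using CdeclFn = int (DEMO_CDECL *)(int, int);

// True only where the calling convention is part of the function type.
constexpr bool kConventionsDiffer = !std::is_same<StdcallFn, CdeclFn>::value;

#if defined(_MSC_VER) && defined(_M_IX86)
// 32-bit x86: the two conventions give distinct function pointer types.
static_assert(kConventionsDiffer, "expected distinct types on x86");
#else
// x64 and other targets: the keyword is ignored (or absent in this sketch),
// so both aliases name the same type.
static_assert(!kConventionsDiffer, "expected identical types outside x86");
#endif
```

Building this with the x86 and x64 MSVC toolsets should exercise the two branches respectively.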
Upvotes: 8