Reputation: 91630
I'm currently doing a deep-dive into Assembly land, mainly from the perspective of x86_64, C, and System V AMD64, generally targeting Linux.
It's pretty straightforward that the calling convention for integer (and by implication, pointer) values by using the following registers in order:
Longer argument counts are handled by pushing values onto the stack frame of the subroutine. I got these register names from the Wikipedia page on x86_64 calling conventions.
For larger values like structs and arrays, the convention also seems to be to push into the callee's stack frame.
However, what is the calling convention for floating-point arguments to functions? Are floating point registers used?
Another related question: what if I have mixed argument types?
void mixed(int a, float b, mystruct c) { /* ... */ }
If my function takes an arg list like this, how do I call such a function from Assembly? Which registers are used in interleaved arg lists like this?
Upvotes: 7
Views: 3591
Reputation: 91630
The first up-to-8 float
/ double
args are passed in XMM0-7, one each in the bottom of the register like with movss
/ movsd
, regardless of how many earlier args of other types there are.
(Unlike Windows x64, where only the first 4 total args can be passed in registers.)
A float
or double
is returned in XMM0.
For example, double foo(float a, double b) { return a+b; }
compiles to (Godbolt)
foo:
cvtss2sd xmm0, xmm0 # implicit conversion of a to double
addsd xmm0, xmm1 # add, leaving the return value in XMM0
ret
Compilers are good at following the ABI, so compiling an example with optimization enabled is a good way to see which incoming registers or memory locations hold which args. Assigning values to a volatile
global can be a good way to force compilers to not optimize something away. And/or compile a caller that passes different values for each arg.
The calling convention for parameter passing is specified in the System V Application Binary Interface for AMD64PDF documentation in section 3.2.3.
I'm not sure if the documentation can be legally quoted here, but I can at least paraphrase.
First, the documentation defines eight different classifications for parameter values:
complex
floating point types.It next defines how C types fit into these classifications:
_Bool
, char
, short
, int
, long
, long long
, and pointers are classified as INTEGER and will use those registers.
float
, double
, _Decimal32
, _Decimal64
, and __m64
are classified as SSE and will use those registers.
__float128
, _Decimal128
, and __m128
are split in half, storing the least significant bytes/bits in SSE and the most significant bytes/bits in SSEUP.
__m256
is split into four 64-bit (8 byte) values, with the least significant bytes being stored as SSE and the rest as SSEUP
__m512
is similarly split into 64-bit (8 byte) chunks, with the least significant bytes stored as SSE and everything else as SSEUP
long double
values store their 64-bit mantissa as X87 and the 16-bit exponent is padded to 64-bits (8 bytes) and stored in X87UP.
__int128
is essentially stored as two long
values in INTEGER with the first half being the low bits/bytes and the second half being the high bits/bytes. They can be understood as if they were defined as a struct:
typedef struct {
long low_bits, high_bits;
} __int128;
complex double
and complex float
types are split in half, with the first half being the real component and the second half being the imaginary component, and are stored in SSE. They can be understood as if they were defined as a struct like so:
typedef struct {
double real, imaginary;
} complex_double;
complex long double
values are classified as COMPLEX_X87.
The logic for struct
s, union
s, and arrays is fairly complicated, consult the documentation linked above for more information. In a nutshell, there is a recursive algorithm defined for how to pass aggregate types that decides how values are passed.
Now that we have a classification system and a recursive algorithm for dealing with struct
s, union
s, and arrays, we apply this system and algorithm to the parameters to a function, which consists of the following steps for each argument:
%rdi
, %rsi
, %rdx
, %rcx
, %r8
, and %r9
.%xmm0
to %xmm7
.%xmm
register for SSE types.Rinse and repeat for all argument values. If you run out of registers for a given type, write to the stack.
TL;DR There is a non-trivial, but fairly straightforward algorithm defined by the System V ABI for passing different types of data.
Upvotes: 8