Reputation: 41
I am trying to understand how variable-length argument lists work in C.
When a variadic function (e.g. printf(const char *format, ...)) is called, where are the arguments copied (stack or registers)? And how does the called function get information about the arguments passed by the calling function?
I highly appreciate any form of help. Thanks in advance.
Upvotes: 4
Views: 4092
Reputation: 4877
Variable argument lists are a standard feature of the C language, and as such must be supported on any machine for which a C compiler exists.
By "any machine" we mean that the feature must be available regardless of how parameters are passed: in registers, on the stack, or both.
What is really needed to implement the functionality is that the process be deterministic. It does not matter whether parameters are passed on the stack, in registers, in both, or in some other MCU-specific way; what matters is that the method is well defined and always the same.
If this property holds, we can always walk the parameter list and access each parameter.
The method used to pass parameters on each machine or system is specified in the ABI (Application Binary Interface, see https://en.wikipedia.org/wiki/Application_binary_interface). By following its rules in reverse, you can always recover the parameters.
However, on the vast majority of systems, knowing the ABI alone isn't sufficient to recover a parameter, because parameter sizes can differ from the standard CPU register or stack word size. In that case you need one more piece of information about the parameter you are looking for: the operand size.
Let's review variable parameter handling in C. First you declare a function with a single named parameter of type int, holding the count of arguments passed in the variable part, followed by the three dots for the variable part:
int foo(int cnt, ...);
To access the variable arguments you normally use the definitions in the <stdarg.h> header, in the following way:
#include <stdarg.h>
#include <stdio.h>

int foo(int cnt, ...)
{
    va_list ap;          // Object used to iterate through the arguments
    int i, val;
    va_start(ap, cnt);   // Initialize ap from the last named parameter
    for (i = 0; i < cnt; i++)
    {
        val = va_arg(ap, int);   // Retrieve the next argument, given its type
        printf("%d ", val);      // Print the argument, an integer
    }
    va_end(ap);          // Release ap; a no-op on many implementations
    putchar('\n');
    return cnt;          // Added: foo is declared to return int
}
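To make the calling side concrete, here is a minimal sketch of a count-driven variadic function that returns a value instead of printing. The name sum_ints is hypothetical and not part of the answer above; the only thing the callee knows about the variable part is the count it was given.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: adds up `cnt` int arguments. */
int sum_ints(int cnt, ...)
{
    va_list ap;
    int total = 0;

    assert(cnt >= 0);            /* the count is all the callee knows */
    va_start(ap, cnt);
    for (int i = 0; i < cnt; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 10, 20, 12) passes the count first and then exactly that many int values. Passing fewer values than the count promises is undefined behavior, because the function has no way to detect it.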
On a stack-based machine (e.g. 32-bit x86) where the parameters are pushed sequentially, the code above works more or less like the following:
/* Illustration only: this is not portable C. */
int foo(int cnt, ...)
{
    char *ap;            // Pointer used to iterate through parameters
    int i, val;
    ap = (char *)&cnt;   // Initialize pointer to the last known parameter
    for (i = 0; i < cnt; i++)
    {
        /*
         * Advance the pointer to the next parameter on the stack.
         * Here we simply add the size of int because normally the stack
         * word size equals the natural integer size of the machine; for
         * any other type we **must** round the size up to the next
         * multiple of the stack word size before adding it.
         */
        ap = ap + sizeof(int);
        val = *((int *)ap);      // Retrieve next parameter using pointer and size
        printf("%d ", val);      // Print parameter, an integer
    }
    putchar('\n');
    return cnt;          // Added: foo is declared to return int
}
Please note that if we access types other than int, and/or types whose size differs from the native stack word size, the pointer must always be advanced by a multiple of the stack word size.
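The rounding rule can be sketched as a small simulation. Everything here is an assumption made for illustration: a 4-byte stack word (STACK_WORD), a byte buffer standing in for the argument area, and the helper names slot_size and fetch. It is not how any particular compiler implements va_arg.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed stack word size of a 32-bit stack machine (illustrative). */
#define STACK_WORD 4u

/* Round a type size up to the next multiple of the stack word size. */
static size_t slot_size(size_t type_size)
{
    assert(type_size > 0);
    return (type_size + STACK_WORD - 1) / STACK_WORD * STACK_WORD;
}

/*
 * Copy one value of `size` bytes out of the simulated argument area
 * and return the cursor advanced by a whole number of stack words.
 */
static const unsigned char *fetch(const unsigned char *ap, void *out,
                                  size_t size)
{
    memcpy(out, ap, size);
    return ap + slot_size(size);
}
```

With these assumptions a 6-byte object occupies 8 bytes of argument area, so the value that follows it always starts on a word boundary.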
Now consider a machine that uses registers to pass parameters. For simplicity we assume that no operand is larger than a register and that registers are allocated sequentially. (Also note the pseudo-assembler instruction mov val, rx, which loads the variable val with the contents of register rx.)
int foo(int cnt, ...)
{
    int ap;   // Register index used to iterate through parameters
    int i, val;
    /*
     * Initialize the index to the last known parameter: cnt is the
     * first parameter in the list, so it is held in r1.
     */
    ap = 1;
    for (i = 0; i < cnt; i++)
    {
        /*
         * Retrieve next parameter.
         * The code below obviously isn't real code, but should give the idea.
         */
        ap++;   // Next parameter
        switch (ap)
        {
        case 1:
            __asm mov val, r1;   // Get value from register
            break;
        case 2:
            __asm mov val, r2;
            break;
        case 3:
            __asm mov val, r3;
            break;
        .....
        case n:
            __asm mov val, rn;
            break;
        }
        printf("%d ", val);   // Print parameter, an integer
    }
    putchar('\n');
    return cnt;   // Added: foo is declared to return int
}
Hope the concept is clear enough now.
Upvotes: 5
Reputation: 41
How the arguments are stored is defined by the ABI document of each architecture. The excerpt below is taken from one such document.
Reference Link: https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf (page number 56).
The Register Save Area: The prologue of a function taking a variable argument list and known to call the macro va_start is expected to save the argument registers to the register save area. Each argument register has a fixed offset in the register save area.
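The idea of a register save area can be sketched as a simulation. The layout below is loosely modelled on the x86-64 System V scheme (six integer argument registers, 8 bytes each), but the type sim_va_list and the function sim_va_arg_long are hypothetical names invented for this illustration, and floating-point registers are left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Six 8-byte integer argument registers, as assumed for this sketch. */
enum { GP_REGS = 6, GP_SLOT = 8 };

struct sim_va_list {
    unsigned gp_offset;             /* byte offset of next unread register slot */
    const long *reg_save_area;      /* registers spilled by the prologue */
    const long *overflow_arg_area;  /* next argument passed on the stack */
};

/* Fetch the next integer argument: saved registers first, then stack. */
static long sim_va_arg_long(struct sim_va_list *ap)
{
    assert(ap != NULL);
    if (ap->gp_offset < GP_REGS * GP_SLOT) {
        long v = ap->reg_save_area[ap->gp_offset / GP_SLOT];
        ap->gp_offset += GP_SLOT;   /* fixed offset per register */
        return v;
    }
    return *ap->overflow_arg_area++;
}
```

In this model, the equivalent of va_start for a function with one named integer parameter would set gp_offset to 8, so that the first fetch skips the slot holding the named parameter.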
Upvotes: 0
Reputation: 5635
Traditionally, the arguments were "always" pushed on the stack, regardless of other register-passing optimisations, and va_list was basically just a pointer into the stack identifying the next argument for va_arg. However, register passing is so favoured on newer processors and by compiler optimisation settings that even variadic arguments are passed in registers.
With this, va_list becomes a small data structure (or a pointer to that data structure) which captures all those register arguments and also has a pointer into the stack, in case there are more arguments than fit in registers. The va_arg macro first steps through the captured registers, then steps through the stack entries, so va_list also has a "current index".
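Note that, at least in the gcc implementation for x86-64, va_list is a hybrid object: when declared in a function body it is an instance of the structure, but when passed as an argument it becomes a pointer. This happens because va_list is defined there as an array of one structure, and arrays decay to pointers when passed to a function; the effect resembles a C++ reference even though C doesn't have the concept of references.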
On some platforms va_list also allocates some dynamic memory, which is why you should always call va_end.
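Because va_list carries a "current index", a list that has been walked once cannot simply be walked again. A minimal sketch of the usual way around this, using the standard va_copy macro (C99) and pairing every va_start and va_copy with its own va_end, could look like the following. The function name format_alloc is hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd formatted string, or NULL. */
char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    char *out = NULL;
    int need;

    assert(fmt != NULL);
    va_start(ap, fmt);
    va_copy(ap2, ap);                       /* independent second cursor */

    need = vsnprintf(NULL, 0, fmt, ap);     /* first pass: measure only */
    va_end(ap);

    if (need >= 0 && (out = malloc((size_t)need + 1)) != NULL)
        vsnprintf(out, (size_t)need + 1, fmt, ap2);  /* second pass */
    va_end(ap2);
    return out;
}
```

The caller owns the returned buffer and releases it with free.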
Upvotes: 2
Reputation: 68089
C has standard mechanisms to access those parameters. The macros are defined in stdarg.h:
http://www.cse.unt.edu/~donr/courses/4410/NOTES/stdarg/
Here is a very simple implementation of sniprintf:
/* ts_itoa is not shown here. */
int ts_formatstring(char *buf, size_t maxlen, const char *fmt, va_list va)
{
    char *start_buf = buf;
    maxlen--;   /* Reserve room for the terminating null */
    while (*fmt && maxlen)
    {
        /* Character needs formatting? */
        if (*fmt == '%')
        {
            switch (*(++fmt))
            {
            case '\0':
                /* Lone '%' at the end of the format: stop before running past it */
                continue;
            case 'c':
                *buf++ = va_arg(va, int);
                maxlen--;
                break;
            case 'd':
            case 'i':
                {
                    signed int val = va_arg(va, signed int);
                    if (val < 0)
                    {
                        val *= -1;
                        *buf++ = '-';
                        maxlen--;
                    }
                    maxlen = ts_itoa(&buf, val, 10, maxlen);
                }
                break;
            case 's':
                {
                    char * arg = va_arg(va, char *);
                    while (*arg && maxlen)
                    {
                        *buf++ = *arg++;
                        maxlen--;
                    }
                }
                break;
            case 'u':
                maxlen = ts_itoa(&buf, va_arg(va, unsigned int), 10, maxlen);
                break;
            case 'x':
            case 'X':
                maxlen = ts_itoa(&buf, va_arg(va, int), 16, maxlen);
                break;
            case '%':
                *buf++ = '%';
                maxlen--;
                break;
            }
            fmt++;
        }
        /* Else just copy */
        else
        {
            *buf++ = *fmt++;
            maxlen--;
        }
    }
    *buf = 0;
    return (int)(buf - start_buf);
}
int sniprintf(char *buf, size_t maxlen, const char *fmt, ...)
{
    int length;
    va_list va;
    va_start(va, fmt);
    length = ts_formatstring(buf, maxlen, fmt, va);
    va_end(va);
    return length;
}
It is from the Atollic Studio tiny printf.
All the mechanisms (including passing the list of parameters on to another function) are shown here.
Upvotes: -1
Reputation: 409482
Most implementations push the arguments on the stack; using registers won't work well on register-starved architectures or, more generally, if there are more arguments than registers.
And the called function doesn't know anything at all about the arguments, their count or their types. That's why e.g. printf and related functions use format specifiers. The called function will then interpret the next part of the stack according to that format specifier (using the va_arg "function").
If the type fetched by va_arg doesn't match the actual type of the argument, you will have undefined behavior.
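As a minimal sketch of this format-driven approach, the function below reads a string of type codes and fetches each argument with the matching type. The name sum_mixed and the codes 'i' (int) and 'd' (double) are assumptions made for this example, not a standard convention.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/*
 * Hypothetical helper: each character of `types` names the type of
 * the next argument ('i' = int, 'd' = double); returns their sum.
 */
double sum_mixed(const char *types, ...)
{
    va_list ap;
    double total = 0.0;

    assert(types != NULL);
    va_start(ap, types);
    for (; *types != '\0'; types++) {
        switch (*types) {
        case 'i':
            total += va_arg(ap, int);
            break;
        case 'd':
            total += va_arg(ap, double);
            break;
        default:
            /* Unknown code: the remaining types cannot be known, so stop. */
            va_end(ap);
            return total;
        }
    }
    va_end(ap);
    return total;
}
```

A call such as sum_mixed("idi", 1, 2.5, 3) works because every code matches the argument in that position. Writing sum_mixed("d", 1) would fetch a double where an int was passed, which is exactly the mismatch described above.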
Upvotes: 0
Reputation: 41542
where the arguments are copied (stack/register?)?
It varies. On x64 normal conventions are used: the first few arguments (depending on type) probably go into registers, and other arguments go onto the stack. The C standard requires that the compiler support at least 127 arguments to a function, so it's inevitable that some of them are going to go on the stack.
how the called function gets the information about the arguments passed by calling function?
By using the initial arguments, such as the printf format string. The varargs support facilities in C don't allow the function to inspect the number and types of arguments, only to get them one at a time (and if they're retrieved with the wrong type, or if more arguments are accessed than were passed, the result is undefined behavior).
Upvotes: 0