Struct variable passed by value vs. passed by pointer to a function

Question

Let's say I have the following structure:

typedef struct s_tuple{
    double  x;
    double  y;
    double  z;
    double  w;
}   t_tuple;

And the following two functions:

t_tuple tuple_sub_values(t_tuple a, t_tuple b)
{
    a.x -= b.x;
    a.y -= b.y;
    a.z -= b.z;
    a.w -= b.w;
    return (a);
}

t_tuple tuple_sub_pointers(t_tuple *a, t_tuple *b)
{
    t_tuple c;

    c.x = a->x - b->x;
    c.y = a->y - b->y;
    c.z = a->z - b->z;
    c.w = a->w - b->w;
    return (c);
}

Will there be a performance difference between the two functions? Is one of them better than the other? Basically, what are the pros and cons of passing by value vs. passing by pointer when every member of the structure is accessed?

Edit: I completely changed my structure and functions to give a more precise example. I found this post that is related to my question, but it is about C++: https://stackoverflow.com/questions/40185665/performance-cost-of-passing-by-value-vs-by-reference-or-by-pointer#:~:text=In%20short%3A%20It%20is%20almost,reference%20parameters%20than%20value%20parameters.

Context: My structures are not huge in this example, but I am coding a ray tracer, and some structs of around 100 bytes can be passed to functions millions of times, so I'd like to optimize these calls. My structs are nested inside one another, so it would be a mess to copy them here; this is why I asked my question using a more general example.

Petr Skocik · Accepted Answer

Getting to the core of the question: for optimal arg-passing/value-returning performance, you basically want to follow the ABI of your platform to try and make sure that things are in registers and stay in registers. If they aren't in registers or cannot stay in registers, then passing larger-than-pointer-size data by pointer will likely save some copying (unless the copying would need to be done in the callee anyway: void pass_copy(struct large x){ use(&x); } could actually be a small bit better for codegen than void pass_copy2(struct large const*x){ struct large cpy=*x; use(&cpy); }).
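To make the parenthetical case concrete, here is a minimal compilable sketch of the two variants. The definition of struct large and the body of use() are assumptions made up for illustration, and the functions return a double (instead of void) only so that the result can be observed.

```c
/* Hypothetical "large" aggregate: 16 doubles = 128 bytes. */
struct large {
    double v[16];
};

/* Hypothetical callee that needs a mutable object it is free to modify. */
static double use(struct large *p)
{
    p->v[0] += 1.0;
    return p->v[0] + p->v[15];
}

/* By value: the parameter x already is a private copy, so use() can work on it. */
double pass_copy(struct large x)
{
    return use(&x);
}

/* By pointer: the callee has to make the private copy itself. */
double pass_copy2(struct large const *x)
{
    struct large cpy = *x;

    return use(&cpy);
}
```

Both variants leave the caller's object untouched. Whether one compiles to fewer copies than the other depends on the compiler and ABI, so comparing the generated assembly is the only way to be sure.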

The concrete rules for, e.g., the SysV x86-64 ABI are a bit complicated (see the chapter on calling conventions). But a short version might be: args/return values go through registers as long as their type is "simple enough" and appropriate argument-passing registers are available (6 for integer values and 8 for floating-point values). Structs of up to two eightbytes can go through registers (as arguments or a return value) provided they're "simple enough".
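As a rough illustration of the size part of that rule, the sketch below checks whether a type fits in two eightbytes (16 bytes). The helper fits_two_eightbytes and the t_twodoubles type are hypothetical names, and the check is only a necessary condition: the real classification in the ABI also looks at the member types.

```c
#include <stddef.h>

/* Hypothetical helper type: two doubles, i.e. two eightbytes. */
typedef struct s_twodoubles {
    double lo;
    double hi;
} t_twodoubles;

/* The tuple from the question: four doubles, i.e. four eightbytes. */
typedef struct s_tuple {
    double x;
    double y;
    double z;
    double w;
} t_tuple;

/* Size test only (assumption: 8-byte eightbytes as on x86-64). */
static int fits_two_eightbytes(size_t size)
{
    return size <= 2 * 8;
}
```

With 8-byte doubles, fits_two_eightbytes(sizeof(t_twodoubles)) is true while fits_two_eightbytes(sizeof(t_tuple)) is false, which matches the point that a t_tuple is too large to be passed or returned in registers as a whole.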

Supposing your doubles are already loaded in registers (or aren't aggregated into t_tuples that you could point the callee to), the most efficient way to pass them on the x86-64 SysV ABI would be individually or via structs of two doubles each, but you'd still need to return them via memory because the ABI can only accommodate two-double return values in registers, not four-double ones. If you returned a four-double struct, the compiler would stack-allocate memory in the caller, pass a pointer to it as a hidden first argument, and then return a pointer to the allocated memory (under the covers). A more flexible approach would be to not return such a large aggregate but instead explicitly pass a pointer to a struct-to-be-filled. That way the struct can be anywhere you want it (rather than auto-allocated on the stack by the compiler).

So something like

void tuple_sub_values(t_tuple *retval, 
      t_twodoubles a0, t_twodoubles a1, 
      t_twodoubles b0, t_twodoubles b1);

would be a better API for avoiding memory spills on the x86-64 SysV ABI (Linux, macOS, BSDs...).

If your measurements show the codesize savings / performance boost to be worth it for you, you could wrap it in an inline function that'd do the struct-splitting.
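A minimal sketch of such a wrapper follows. The names t_twodoubles, tuple_sub_split and tuple_sub are assumptions chosen for illustration (the answer does not define them), and the worker is shown with a body only so the example is complete.

```c
/* The tuple from the question. */
typedef struct s_tuple {
    double x;
    double y;
    double z;
    double w;
} t_tuple;

/* Hypothetical two-double struct, small enough to be passed in registers. */
typedef struct s_twodoubles {
    double lo;
    double hi;
} t_twodoubles;

/* Out-of-line worker: takes the halves by value and fills *retval. */
void tuple_sub_split(t_tuple *retval,
                     t_twodoubles a0, t_twodoubles a1,
                     t_twodoubles b0, t_twodoubles b1)
{
    retval->x = a0.lo - b0.lo;
    retval->y = a0.hi - b0.hi;
    retval->z = a1.lo - b1.lo;
    retval->w = a1.hi - b1.hi;
}

/* Inline wrapper: splits each t_tuple into two halves at the call site. */
static inline void tuple_sub(t_tuple *retval,
                             const t_tuple *a, const t_tuple *b)
{
    t_twodoubles a0 = { a->x, a->y };
    t_twodoubles a1 = { a->z, a->w };
    t_twodoubles b0 = { b->x, b->y };
    t_twodoubles b1 = { b->z, b->w };

    tuple_sub_split(retval, a0, a1, b0, b1);
}
```

A caller would write tuple_sub(&result, &a, &b) and never deal with the split structs directly. If the tuples already live in memory, the split costs loads that a plain pointer-taking function would need anyway, so this is only likely to pay off when the values are already in registers at the call site, which is something to confirm by measuring.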
