jjg

Reputation: 1024

Pass arguments by value in C: how big is too big?

I'd be interested to know what seasoned C programmers think is an upper bound for the size of an argument which can be passed by value.

Context: I have occasion to work with 2×2 matrices, which I have in a struct:

typedef struct
{
  double a, b, c, d;
} mat_t;

Now it makes life a lot easier if I can pass by value, particularly for composite operations

mat_t A = mat_sum(mat_smul(lambda, B), C);

for A = λB + C, for example. At the same time, I'm aware that pass-by-value involves copying the arguments (typically onto the runtime stack), so it comes with a cost if they are big.

That the C standard library passes complex numbers by value suggests "two doubles" as a reasonable lower bound, but what is a reasonable upper bound?

Upvotes: 3

Views: 1279

Answers (3)

chqrlie

Reputation: 144951

In your example, passing the matrices by value does not cause much overhead compared to passing them by reference (i.e., passing a pointer to local matrix objects) and is much more readable, as you mention.

What seems important for code generation is the ability to inline these functions. Modern compilers are quite good at this, but it can help to define the functions as static inline in the header file.

Look at the code generated for this simple example:

#include <stdio.h>

typedef struct {
    double a, b, c, d;
} mat_t;

mat_t mat_sum(mat_t m1, mat_t m2) {
    return (mat_t){ m1.a + m2.a, m1.b + m2.b, m1.c + m2.c, m1.d + m2.d };
}

mat_t mat_smul(double x, mat_t m) {
    return (mat_t){ x * m.a, x * m.b, x * m.c, x * m.d };
}

mat_t *mat_sump(mat_t *res, const mat_t *m1, const mat_t *m2) {
    res->a = m1->a + m2->a;
    res->b = m1->b + m2->b;
    res->c = m1->c + m2->c;
    res->d = m1->d + m2->d;
    return res;
}

mat_t *mat_smulp(mat_t *res, double x, const mat_t *m) {
    res->a = x * m->a;
    res->b = x * m->b;
    res->c = x * m->c;
    res->d = x * m->d;
    return res;
}

void mat_print(const char *name, mat_t M) {
    printf("%s: { %g, %g, %g, %g };\n", name, M.a, M.b, M.c, M.d);
}

mat_t A = { 1, 2, 3, 4 };
mat_t B = { 4, 3, 2, 1 };
double lambda = 2.5;

int main(void) {
    mat_t C = mat_sum(mat_smul(lambda, A), B);
    mat_print("C", C);

    mat_t T;
    mat_sump(&C, mat_smulp(&T, lambda, &A), &B);
    mat_print("C", C);

    return 0;
}
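The `static inline` suggestion above could look roughly like this in a header. This is only a sketch: the file name `mat.h` and the include guard `MAT_H` are assumptions for illustration, and whether the calls actually get inlined is still up to the compiler and optimization level.

```c
/* mat.h: hypothetical header name, used only for illustration */
#ifndef MAT_H
#define MAT_H

typedef struct {
    double a, b, c, d;
} mat_t;

/* static inline: every translation unit that includes this header sees
   the function body, so the compiler may inline the call and avoid
   copying the struct arguments at all. */
static inline mat_t mat_sum(mat_t m1, mat_t m2) {
    return (mat_t){ m1.a + m2.a, m1.b + m2.b, m1.c + m2.c, m1.d + m2.d };
}

static inline mat_t mat_smul(double x, mat_t m) {
    return (mat_t){ x * m.a, x * m.b, x * m.c, x * m.d };
}

#endif /* MAT_H */
```

With this header included, `mat_sum(mat_smul(lambda, A), B)` is written exactly as before.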

Upvotes: 1

klutt

Reputation: 31409

An upper bound is basically the stack size. On Windows the default is usually 1 MB and on Linux 8 MB. This can be changed (with linker options on Windows, or with `ulimit -s` on Linux). In practice, I'd say the upper bound is MUCH lower, but I don't have a good rule of thumb for that.
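On POSIX systems such as Linux, the stack limit can be checked at run time rather than assumed. A minimal sketch, assuming `getrlimit` from `<sys/resource.h>` is available (it is not on Windows); `stack_soft_limit` is a hypothetical helper name:

```c
#include <sys/resource.h>

/* Hypothetical helper: stores the soft stack limit in *bytes.
   Returns 0 on success, -1 on failure.
   Note: the stored value may be RLIM_INFINITY if the stack is unlimited. */
int stack_soft_limit(rlim_t *bytes) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    *bytes = rl.rlim_cur;
    return 0;
}
```

This only reports the limit for the main thread's stack; threads created later may have a different stack size.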

The only rule of thumb that's relevant here is that you should use pointers for structs in general.

Large structures can give a huge performance hit because of unnecessary copying.

And I cannot really see how it would make things so much easier. Because you can easily do something like this:

T foo(mat_t *A) {
    mat_t a = *A;   // local copy of the pointed-to struct

    // Continue as if A was not a pointer, but use a instead
}

If you really feel that you need to copy the whole struct just to make the rest a little bit easier, use the above method, because that will make it A LOT easier when you realize that you need to refactor the code because it's too slow.
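A complete instance of that pattern might look like the following. It is only a sketch: `mat_det` is a hypothetical function, not something from the question, and the `const` qualifier is an addition.

```c
typedef struct {
    double a, b, c, d;
} mat_t;

/* Hypothetical example: determinant of a 2x2 matrix.
   The caller passes a pointer; the function makes one explicit local copy. */
double mat_det(const mat_t *A) {
    mat_t a = *A;   /* remove this copy later if profiling shows it matters */
    return a.a * a.d - a.b * a.c;
}
```

The call site uses `mat_det(&M)`, and the body reads as if the matrix had been passed by value.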

Upvotes: 2

Lundin

Reputation: 214310

This is fairly subjective, but in general it goes like this:

Everyone generally agrees that "primitive data types" (integers, floating point, pointers, etc.) are fine to pass by value, within reason: if you have more than 5 or so parameters, then perhaps you should have used a struct instead.

Some programmers think structs are fine to pass by value if they only contain a few such primitive data types, or at least as long as the struct size stays below the data word size of the CPU. Others are more strict and say that structs should always be passed by reference no matter how big or small, because that makes your coding style consistent, but also because there's not much of a performance difference between accessing parameters indirectly through a pointer and accessing local variables directly. Copying data to the stack is always a performance hit, though, whenever the function cannot be inlined.
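To see where a given struct falls relative to that "word size" guideline, its size can be compared against a pointer-sized word. This is a rough sketch: `fits_in_words` is a hypothetical helper, and using `sizeof(void *)` as a stand-in for the CPU word size is an assumption, since C does not expose the word size directly.

```c
#include <stddef.h>

typedef struct {
    double a, b, c, d;
} mat_t;

/* Hypothetical helper: does an object of `size` bytes fit in `nwords`
   pointer-sized words? Pointer size is only an approximation of the
   CPU's data word size. */
int fits_in_words(size_t size, size_t nwords) {
    return size <= nwords * sizeof(void *);
}
```

For example, `fits_in_words(sizeof(mat_t), 1)` is false on common platforms, so the stricter camp would pass `mat_t` by pointer, while the "a few primitives" camp would still pass it by value.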

There's no obvious right or wrong though, since this boils down to system calling convention and stack frame format.

Everyone generally agrees that you shouldn't allocate huge arrays or structs on the stack, due to the potential for stack overflow. "Huge" might mean 100 bytes or it might mean thousands of bytes, again depending on the system.

If looking at the most low-end systems with just a few general-purpose registers and very limited stack, then in some cases you have to bake all parameters into a struct and pass that one by reference, in order to speed up the function call and reduce stack use.

Upvotes: 2
