emandret
emandret

Reputation: 1399

VLA prototype and multidimensional array argument

I created a C99 VLA function as such :

void create_polygon(int n, int faces[][n]);

I want to call this function in another function where I would allocate my two-dimensional array :

void parse_faces()
{
    int faces[3][6];

    create_polygon(6, faces);
}

When I pass a two-dimensional array as an argument, it passes a pointer to a 6 integer array, referencing the stack memory in the calling function.

The VLA argument here only acts as a type declaration (not allocating any actual memory), telling the compiler to access the data in row-major order with ((int*)faces)[i * 6 + j] instead of faces[i][j].

What is the difference between declaring functions with a VLA argument or with a fixed size ?

Upvotes: 2

Views: 364

Answers (2)

Aconcagua
Aconcagua

Reputation: 25536

faces[i][j] always is equivalent to *(*(faces + i) + j), no matter if VLA or not.

Now let's compare two variants (not considering that you actually need the outer dimension as well to prevent exceeding array bounds on iterating):

void create_polygon1(int faces[][6]);
void create_polygon2(int n, int faces[][n]);

It doesn't matter if array passed to originally were created as classic array or as VLA, first function accepts arrays of length of exactly 6, second can accept arbitrary length array (assuming this being clear so far...).

faces[i][j] will now be translated to:

*((int*)faces + (i * 6 + j)) // (1)
*((int*)faces + (i * n + j)) // (2)

Difference yet looks marginal, but might get more obvious on assembler level (assuming all variables are yet stored on stack; assuming sizeof(int) == 4):

LD     R1, i;
LD     R2, j;
MUL    R1, R1, 24; // using a constant! 24: 6 * sizeof(int)!
MUL    R2, R2, 4;  // sizeof(int)
ADD    R1, R2, R2; // index stored in R1 register

LD     R1, i;
LD     R2, j;
LD     R3, m;      // need to load from stack
MUL    R3, R3, 4;  // need to multiply with sizeof(int) yet     
MUL    R1, R1, R3; // can now use m from register R3
MUL    R2, R2, 4;  // ...
ADD    R1, R2, R2; // ...

True assembler code might vary, of course, especially if you use a calling convention that allows passing some parameters in registers (then loading n into into R3 might be unnecessary).


For completeness (added due to comments, unrelated to original question):
There's yet the int* array[] case: Representation by array of pointers to arrays.

*((int*)faces + (i * ??? + j))

doesn't work any more, as faces in this case is no contiguous memory (well, the pointers themselves are in contiguous memory, of course, but not all the faces[i][j]). We are forced to do:

*(*(faces + i) + j)

as we need to dereference the true pointer in the array before we can apply the next index. Assembler code for (for comparison, need a more complete variant of the pointer to 2D-array case first):

LD     R1, faces;
LD     R2, i;
LD     R3, j;
LD     R4, m;      // or skip, if no VLA
MUL    R4, R4, 4;  // or skip, if no VLA
MUL    R2, R2, R3; // constant instead of R3, if no VLA
MUL    R3, R3, 4;
ADD    R2, R2, R3; // index stored in R1 register
ADD    R1, R1, R2; // offset from base pointer
LD     R1, [R1];   // loading value of faces[i][j] into register

LD     R1, faces;
LD     R2, i;
LD     R3, j;
MUL    R2, R2, 8;  // sizeof(void*) (any pointer)
MUL    R3, R3, 4;  // sizeof(int)
ADD    R1, R1, R2; // address of faces[i]
LD     R1, [R1];   // now need to load address - i. e. de-referencing faces[i]
ADD    R1, R1, R3; // offset within array
LD     R1, [R1];   // loading value of faces[i][j] into register

Upvotes: 2

emandret
emandret

Reputation: 1399

I disassembled this code :

void    create_polygon(int n, int faces[][6])
{
    int a = sizeof(faces[0]);
    (void)a;
}

With VLA argument :

movl    %edi, -4(%rbp)   # 6
movq    %rsi, -16(%rbp)  # faces
movl    %edi, %esi
shlq    $2, %rsi         # 6 << 2 = 24
movl    %esi, %edi

With fixed size :

movl    %edi, -4(%rbp)
movq    %rsi, -16(%rbp)
movl    $24, %edi        # 24

As Aconcagua pointed out, in the first example using a VLA, the size is computed at run time by multiplying the size of an int by the size of the second dimension, which is the argument stored in rsi, then moved into edi.

In the second example, the size is directly computed at compile time and placed into edi. The main advantage being the ability to check an incorrect pointer type argument if passing a different size, thus avoiding a crash.

Upvotes: 0

Related Questions