C Compilers -- Indirection with Multidim Arrays

By definition, in every standard of C, x[y] is equivalent to (and often compiled as) *((x)+(y)). Additionally, a name of an array is converted to an address operator to it -- so if x is an array, it would be *((&(x))+(y))

So, for a multidimension array, x as a 2 dimension array, x[y][z] would be equivalent to (((&(x))+(y))+(z))

In the small scale toy C compiler I'm working on, this fails to generate proper code, because it tries to indirectly access a pointed to address at every * instruction -- this works for single dimension arrays, but for multi dimension it results in something like (in vaguely assembly pseudocode)

load &x; add y; deref; add z; deref

Where deref is an instruction to load the value at the address of the previous calculation -- as this is how the indirection operator seems to work??

However, this will generate bad code, since we should be dealing all with a single address, only dereferencing at the very end. I'm assuming there's something in the spec I'm missing?

Upvotes: 1

Answers (3)

Vlad from Moscow

Reputation: 311088

So, for a multidimension array, x as a 2 dimension array, x[y][z] would be equivalent to (((&(x))+(y))+(z))

You are mistaken. The expression x[y][z] is evaluated like:

*( *( x + y ) + z )

Here is a demonstration program:

#include <stdio.h>

int main(void) 
{
    enum { M = 3, N = 3 };
    int a[M][N] =
    {
        { 1, 2, 3 },
        { 4, 5, 6 },
        { 7, 8, 9 }
    };
    
    for ( size_t i = 0; i < M; i++ )
    {
        for ( size_t j = 0; j < N; j++ )
        {
            printf( "%d ", *( *( a + i ) + j ) );
        }
        putchar( '\n' );
    }

    return 0;
}

Its output is:

1 2 3 
4 5 6 
7 8 9

Array designators used in expressions (with rare exceptions) are implicitly converted to pointers to their first elements.

So if you have an array declared like:

int a[M][N];

then the array designator a is converted to a pointer to its first element ("row"). The type of the array element is int[N]. So a pointer to such object has the type int ( * )[N].

If you want that a pointer point to the i-th element of the array you need to write the expression a + i. Dereferencing the expression you will get the i-th row (one-dimensional array) that in turn used in expressions is converted to a pointer to its first element.

So the expression a + i has the type int ( * )[N].

The expression *( a + i ) has the type int[N] that at once is implicitly converted to a pointer of the type int * to its firs element in the enclosing expression.

The expression *( a + i ) + j points to the j-th element of the "row" of the two-dimensional array. Dereferencing the expression *( *( a + i ) + j ) you will get the j-th element of the i-th row of the array.

Upvotes: 0

Lundin

Reputation: 214880

So, for a multidimension array, x as a 2 dimension array, x[y][z] would be equivalent to (((&(x))+(y))+(z))

No, a 2D array is an array of arrays. So *((x)+(y)) gives you that array, x decays into a pointer to the first element, which is then de-referenced to give you array number y.

This array too "decays" into a pointer of the first element, so you get:

( (*((x)+(y))) + (z) )

When part of an expression, arrays always decay into a pointer to it's first element. Except for a few exceptions, namely the & address of and sizeof operators. Why typing out the & as done in your pseudo code is just confusing.

A practical example would be:

int arr[x][y];
for(size_t i=0; i<x; i++)
  for(size_t j=0; j<y; j++)
    arr[i][j] = ...

In the expression arr[i][j], the [] is just "syntactic sugar" for pointer arithmetic (see Do pointers support "array style indexing"?).
So we get *((arr)+(i)), where arr is decayed into a pointer to the type of the first element, int(*)[y].
Pointer arithmetic on that array pointer type yields array number i of type int [y].
Again, there is array decay on this one, because it too is an array part of an expression. We get a pointer to the first element, type int*.
Pointer arithmetic of the int* + j gives the address of the integer, which is then finally de-referenced to give the actual int.

Upvotes: 0

HolyBlackCat

Reputation: 96851

name of an array is converted to an address operator to it

No. You could say that x is converted to &x[0], which has different type compared to &x.

Assuming you have T a[M][N];, doing a[x][y] does following:

a is converted to a temporary pointer of type T (*)[N], pointing to the first array element.
This pointer is incremented by x * sizeof(T[N]), i.e. by x * N * sizeof(T).
The pointer is dereferenced, giving you a value of type T[N].
The result is converted to a temporary pointer of type T *.
The pointer is incremented by y * sizeof(T).
Finally, the pointer is dereferenced to produce a value of type T.

Note that an array itself (multidimensional or not) doesn't store any pointers to itself. When converted to a pointer, the resulting pointer is calculated on the fly.

Upvotes: 2

C Compilers -- Indirection with Multidim Arrays

Answers (3)

Related Questions