In C why do I NOT need to specify 2D array size when passing into function when the 2D array is created with malloc?

Question

I'm pretty new with C and just confused with what's really happening when I'm passing 2D arrays allocated in HEAP memory into a function. I've written code which has three functions, A, B, C which demonstrates my question.

Essentially, when I create a 2D array in stack space in function-A, I am able to pass that 2D array pointer to function-B, which takes the parameters (int size, int (*arr)[size]), and that works fine. My understanding is that the 'int size' variable is required to let the arr pointer know how much space it should jump on each increment.

However, when I create a 2d array in HEAP space in function-A, passing it to function-B appears to lose the location of the data (see code). However if I pass this HEAP space 2d array to function-C which has the parameter (int **arr), it works fine.

It would be great if someone could try to explain why I don't need to specify size when passing the HEAP space 2d array into function-C. Also, when I pass the 2d array created in STACK space to function-C, it crashes, why is that?

Here is sample code showcasing my question:

#include <stdio.h>
#include <stdlib.h>

void function_A(int num)
{
    // allocating HEAP space for 2D array
    int **arrHEAP = (int **)malloc(2*sizeof(int*)); 
    arrHEAP[0] = (int *)malloc(5*sizeof(int));
    arrHEAP[1] = (int *)malloc(5*sizeof(int));
    for(int i=0;i<2;i++) // initialising
        for(int j=0;j<5;j++)
            arrHEAP[i][j] = num++;
    function_B(5, arrHEAP); // prints random data
    function_C(arrHEAP); // prints correctly, works

    // allocating STACK space for 2D array and initialising
    int arrSTACK[2][5] = {{100, 200, 300, 400, 500},{600,700,800,900,1000}};
    function_B(5, arrSTACK); // prints correctly, works
    //function_C(arrSTACK); // if I were to run this it crashes the program, why?
}
void function_B(int size, int (*arr)[size])
{
    for(int i=0;i<2;i++)
        for(int j=0;j<5;j++)
            printf("HEAP row is %d, value is %d:\n", i, arr[i][j]);
}
void function_C(int **arr)
{
    for(int i=0;i<2;i++)
        for(int j=0;j<5;j++)
            printf("HEAP row is %d, value is %d:\n", i, arr[i][j]);
}
int main()
{
    function_A(1);
}

David C. Rankin · Accepted Answer

Array/Pointer Conversion

The gap in your understanding concerns the difference between arrays and pointers. In C, an array is a distinct type of object. One thing that causes confusion is that an array is converted to a pointer to its first element on access (array/pointer conversion). This is governed by C11 Standard - 6.3.2.1 Other Operands - Lvalues, arrays, and function designators(p3) (note the 4 exceptions where array/pointer conversion does not occur).

The key here is type. When you declare a 2D array, e.g.

int arrSTACK[2][5] = {{100, 200, 300, 400, 500},{600,700,800,900,1000}};

On access it will be converted to a pointer, but of what type? A 2D array in C is an array of 1D arrays. Array/pointer conversion only applies to the first level of indirection. So on access arrSTACK is converted to a pointer to its first element, which is an array int[5], giving the type int (*)[5]. Since type controls pointer arithmetic, arrSTACK + 1 advances by five int values so that it points to the beginning of the second 1D array that makes up arrSTACK (the second row).

Pointers

int **arrHEAP declares a single pointer. A pointer-to-pointer-to int. It has nothing to do with an array. However a pointer-to-pointer can be indexed as you would index a 2D array to address the individual integers stored in memory. That is the only similarity between the 2D array and the object created by allocating storage for pointers and then allocating storage for integers and assigning the starting address for each block holding integers to one of the pointers you have allocated. Here there is no guarantee that all elements of arrHEAP are contiguous in memory as they are with a 2D array.

So let's look at how pointer arithmetic differs with arrHEAP. When you dereference arrHEAP, a pointer-to-pointer (e.g. arrHEAP[0]), what type results? If you have a pointer-to-pointer-to int and you dereference it, you are left with a pointer-to int. So with the array, the conversion resulted in the type pointer-to int[5], but arrHEAP[0] is simply a pointer-to int (no 5, it's just a pointer to int). So how does pointer arithmetic differ? arrSTACK + 1 advances the pointer by 5 * sizeof(int) bytes (20 bytes with a 4-byte int). arrHEAP + 1 advances only to the next pointer in your allocated block of pointers (1 pointer, 8 bytes on a typical 64-bit system).

That is why you cannot pass one to the function written for the other. The function expecting the array understands arrSTACK[0] and arrSTACK[1] as being 20 bytes apart, while the pointers arrHEAP[0] and arrHEAP[1] are stored only 8 bytes apart. This is the crux of the pointer-incompatibility warnings and errors you generate.

Then there is the lack of a guarantee that all values of arrHEAP are sequential in memory. You know that arrSTACK[1] is always 20 bytes from the beginning of the array. With arrHEAP, the block the first pointer points to has no guaranteed relationship with the other block from an adjacency standpoint. Either block can later be replaced or reallocated.

What this means is that if you try to provide arrSTACK to function_C(int **arr), the compiler will generate a warning for incompatible pointer types, because they are. Conversely, if you attempt to provide arrHEAP to function_B(int size, int (*arr)[size]), it will likewise issue a warning due to incompatible pointer types, again because they are.

Even if it seems the object and the array would work in the other function because you index both in the same way, the compiler cannot let an incompatible type through silently; that's not the compiler's job.

The compiler can only base its operation on the promise you made to it when you wrote your code. For function_B(int size, int (*arr)[size]) you promised you were sending a 2D array of 1D arrays containing 5 int. With function_C(int **arr), you promised the compiler you would provide a pointer-to-pointer-to int. When the compiler sees you are attempting to pass the wrong object as a parameter, it will warn, and you should heed that warning, because the start of the 2nd block of integers in arrHEAP isn't guaranteed to be 5 int away from the start of the first block, and it won't be found there.
