Dracula

Reputation: 3090

Understanding discrepancy in address assigned to array of size greater than 1 in C

I was looking at understanding how local variables are allocated memory in C. Based on this, the array will be created on the stack. And I thought the stack addressing starts from a higher address and then goes to a lower address. So say I had this:

int a; 
int arr[3];

Say a was at address 100. Then arr would be at address 96 (100 - 4), with the address for arr[3] at 88 (96 - 2 * 4), since int will take 4 bytes.

But in reality, I see something very different happening. If I make an arr of size 1 then it works as expected. But if I increase the array size then the addresses look very different.

It seems like an array of size > 1 does not go on the stack but somewhere else (heap?). Can someone explain to me the gap in addresses between a and arr for size > 1?

Size 1 arr

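#include <stdio.h>

int main(int argc, char* argv[]) {
    int a;
    int arr[1];
    /* %p expects a void *, so each pointer is cast explicitly */
    printf("Address a: %p (%lu)\n", (void *)&a, (unsigned long)&a);
    printf("Address arr[0]: %p (%lu)\n", (void *)arr, (unsigned long)arr);
    /* &arr[1] is the one-past-the-end address; it is only printed, never dereferenced */
    printf("Address arr[1]: %p (%lu)\n", (void *)&arr[1], (unsigned long)&arr[1]);
}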

Address a: 0x7ff7b1fd60cc (140701819822284)
Address arr[0]: 0x7ff7b1fd60c8 (140701819822280)
Address arr[1]: 0x7ff7b1fd60cc (140701819822284)

Size 2 arr

#include <stdio.h>

int main(int argc, char* argv[]) {
    int a;
    int arr[2];
    /* %p expects a void *, so each pointer is cast explicitly */
    printf("Address a: %p (%lu)\n", (void *)&a, (unsigned long)&a);
    printf("Address arr[0]: %p (%lu)\n", (void *)arr, (unsigned long)arr);
    printf("Address arr[1]: %p (%lu)\n", (void *)&arr[1], (unsigned long)&arr[1]);
}

Address a: 0x7ff7b3d970bc (140701851021500)
Address arr[0]: 0x7ff7b3d970d0 (140701851021520)
Address arr[1]: 0x7ff7b3d970d4 (140701851021524)

Upvotes: 0

Views: 81

Answers (3)

Clifford

Reputation: 93556

Whilst it is true that on most architectures the stack grows from high to low memory addresses, and that most C implementations allocate local variables on the stack, it does not follow that later-declared variables will be at lower addresses than earlier ones. The compiler need not allocate variables in the order declared, let alone in the same order as stack growth.

The stack pointer is typically decremented for the function's entire stack frame on entry to the function. That is, the stack allocation for all of a function's non-static, non-register-allocated variables and return address occurs at once. The compiler is then free to order and align those variables as it sees fit within that allocated stack frame.

Moreover, in a debug build a compiler may add padding between variables as a means of overrun detection in the debugger.

So to demonstrate the stack growth behaviour the following would be more instructive:

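#include <stdio.h>

void frame(void)
{
    static int call_depth = 0 ;   /* shared across calls, counts nesting depth */
    int a;
    int arr[2];

    call_depth++ ;
    printf("Call stack depth %d: ", call_depth ) ;

    /* %p expects a void *, so each pointer is cast explicitly */
    printf("Address       a:    %p\n", (void *)&a);
    printf("Address  arr[0]:    %p\n", (void *)arr);
    printf("Address  arr[1]:    %p\n\n", (void *)&arr[1]);
    
    /* Recurse so that each call gets its own stack frame */
    if( call_depth < 5 )
    { 
        frame() ;
    }
}

int main(void)
{
    frame() ;
}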

Upvotes: 0

Eric Postpischil

Reputation: 223852

And I thought the stack addressing starts from a higher address and then goes to a lower address.

Most current ABIs¹ do specify that stacks grow to lower addresses, so if routine A calls routine B, routine B’s stack frame will be at a lower address.

However, a compiler planning its layout of local variables generally does not have to put them in any order in the stack frame. Within one stack frame, the compiler may arrange data freely.

With optimization, local variables might not be stored on the stack at all; the compiler might keep their values in registers or eliminate them entirely.

Suppose a is at address 100, the compiler puts arr just below a, int is four bytes, the address space is “flat” (a simple numbering of bytes), and array elements are laid out at increasing addresses for increasing indices. Then arr will start at address 88: it has three elements of four bytes each, so it needs 12 bytes and starts at 100 − 12 = 88. arr[0] will be at 88, arr[1] at 92, and arr[2] at 96.

Note that although the stack grows to lower addresses, that does not mean we put arr[0] on the stack first, then arr[1], then arr[2]. The typical plan is to put the entire array on the stack as one object. Within the array, its elements are laid out with the address order matching the index order: arr[0] is first in memory (at the lowest address), arr[1] is next (the next higher-addressed element), and arr[2] is after that.

Footnote

¹ Application Binary Interface, a specification of how compiled code (binary/executable code) interacts with other compiled code.

Upvotes: 1

Marcus Müller

Reputation: 36462

int a; 
int arr[3];

Say a was at address 100. Then arr would be at address 96 (100 - 4), with the address for arr[3] at 88 (96 - 2 * 4), since int will take 4 bytes.

No. It's as simple as that: C doesn't state anything about memory layout when declaring variables (that's different for fields declared within structs).

If you weren't performing an operation on both a and arr that actually requires them to be placed in memory (by taking their address and doing something with it, you force them to have an address!), there would be no reason for them to have a proper memory address at all.

Upvotes: 0
