IrAM

Reputation: 1738

Memory for array initialization with a string literal

I was going through this Q&A, where it is said that initializing a char array with a string literal causes two memory allocations: one for the variable and one for the string literal.

I wrote the program below to see how the memory is allocated.

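#include <stdio.h>
#include <string.h>

int main(void)
{
    char a[] = "123454321";
    size_t len = strlen(a);

    /* %p expects a void *, so cast the pointers explicitly */
    printf("a =%p and &a = %p\n", (void *)a, (void *)&a);

    for (size_t i = 0; i < len; i++)
        printf("&a[%zu] =%p and a[%zu] = %c\n", i, (void *)&a[i], i, a[i]);

    return 0;
}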

and the output is:

a =0x7ffdae87858e and &a = 0x7ffdae87858e
&a[0] =0x7ffdae87858e and a[0] = 1
&a[1] =0x7ffdae87858f and a[1] = 2
&a[2] =0x7ffdae878590 and a[2] = 3
&a[3] =0x7ffdae878591 and a[3] = 4
&a[4] =0x7ffdae878592 and a[4] = 5
&a[5] =0x7ffdae878593 and a[5] = 4
&a[6] =0x7ffdae878594 and a[6] = 3
&a[7] =0x7ffdae878595 and a[7] = 2
&a[8] =0x7ffdae878596 and a[8] = 1

From the output, it does not look like there are two separate memory locations for the array and the string literal.

If the array and the string literal do have separate memory, is there any way to prove that array a and the string literal are stored separately in this scenario?

link to clone: https://onlinegdb.com/HkJhdSHyd

Upvotes: 1

Views: 589

Answers (5)

Mac

Reputation: 357

Yes, your string is typically stored in two places: once in the pre-initialized data section of the executable, and then, as your program runs, it is copied into a second place in the program's working memory.

Note: on a Linux system you can run the strings program on the binary to ferret out any strings tucked away in the code.

% cc myprogram.c -o myprogram
% strings myprogram

The memory layout of your executable is defined by the compiler and linker, and the file format is identified to the operating system by a code at the beginning of the file (called a magic number, by the way).

A typical layout has:

  • Text segment (i.e. instructions)
  • Initialized data segment
  • Uninitialized data segment (bss)
  • Heap
  • Stack

Your source code (C, C++, Fortran...) is compiled to machine code for the machine you intend to run it on and stored in the text segment. Data whose value is known at compile time (such as the "123454321") is assigned an address and stored in the initialized data segment by the compiler; string literals often go in a read-only part of it. Static data with no explicit initializer is assigned an address in the uninitialized data segment and is set to 0 when the operating system loads your program. The name bss historically stands for "Block Started by Symbol"; the executable records where this region starts and how large it is, so the loader knows which memory to zero. Next, the heap is where dynamic memory allocation (malloc) gets its memory, and the stack is where local variables and return state are saved when one function calls another.

See pages like https://www.geeksforgeeks.org/memory-layout-of-c-program/ (if it still exists!) for deeper explanation.

Upvotes: 1

You've completely misunderstood the question and answer. The question was about whether the initializer string consumes memory in addition to the actual array. The thing is, you cannot observe the initializer string.

It is as if there were two sheets of paper: one in the closet with 123454321 written on it in ballpoint pen, and one on the desk, initially empty. Then someone else comes, takes the sheet from the closet, reads the text on it, and writes it on the sheet on the desk in pencil. Then they put the first sheet back into the closet.

Now you're looking at the sheet on the desk and saying: "Clearly the text 123454321 has not been written twice onto this sheet, so why do they say there are two copies?"

Upvotes: 1

dxiv

Reputation: 17638

char a[] = "123454321";

Technically, the string literal "123454321" is not required to be stored anywhere as such. All that's required is that a[] be initialized with the right values when main is entered. Whether that's done by copying the string from some static read-only memory location, or running code that fills it in some other way is not mandated by the standard.

As far as the standard goes, it would be perfectly acceptable for the compiler to emit code equivalent to the following in order to initialize a[]:

char a[10];
for(int n = 0; n <= 4; n++)
    a[n] = a[8-n] = '1' + n;
a[9] = '\0';

In fact, at least one compiler (gcc) initializes a[] via custom code, rather than storing and copying the literal string.

mov     DWORD PTR [ebp-22], 875770417    ; =  0x34333231  =  '1', '2', '3', '4'
mov     DWORD PTR [ebp-18], 842216501    ; =  0x32333435  =  '5', '4', '3', '2'
mov     WORD  PTR [ebp-14], 49           ; =  0x31        =  '1', '\0'

Upvotes: 4

David Schwartz

Reputation: 182769

You can prove it by modifying the code as follows:

#include <stdio.h>

int main(void)
{
    for (int i = 0; i < 2; ++i)
    {
        /* a is created and initialized again on every iteration */
        char a[] = "123454321";

        printf("a = %s\n", a);
        a[3] = 'x';
        a[5] = 'y';
        printf("a = %s\n", a);
    }
    return 0;
}

Output:

a = 123454321
a = 123x5y321
a = 123454321
a = 123x5y321

We got the original string back after modifying it, so the original string must have been stored somewhere other than the place we modified.

Upvotes: 1

Serve Laurijssen

Reputation: 9753

You can't prove there are two storage locations because you have only one.

The compiler sees that you want a char array initialized with some characters and a '\0', so it does that. It does not need to store the string literal somewhere else.

This would not compile for that reason: p is a pointer, and an array can only be initialized from a string literal or a brace-enclosed list.

#include <stdio.h>
#include <string.h>

char *p = "123454321";

int main()
{
    char a[] = p;   /* error: an array cannot be initialized from a pointer */
    
    printf("a =%p and &a = %p\n", a, &a);

    for(int i = 0; i< strlen(a); i++)
        printf("&a[%d] =%p and a[%d] = %c\n",i,&a[i],i,a[i]);
    
    return 0;
}

Upvotes: 0
