HelloWorld

Reputation: 193

Why can I update a pointer to a (constant) string literal?

All answers are highly appreciated, and to all those devoting their time to clarifying these things: thank you very much.

I'm learning C and just finished the chapter about pointers. The book I'm reading gives an example that really confused me.

Part of the example code:

...

 1  char *inp_file = "";
 2  char *out_file = "";
 3  int ch;  /* getopt returns int, so ch must not be a plain char */
 4  
 5  while ( ( ch = getopt( argc, argv, "i:o:" )) != EOF )
 6  {
 7      switch( ch )
 8      {
 9          case 'i':
10              inp_file = optarg;
11              break;
12          case 'o':
13              out_file = optarg;
14              break;
15  
16          default:
17              fprintf( stderr, "Unknown option: '%s'\n", optarg );
18              return 2;
19      }
20  }
21  
22  argc -= optind;
23  argv += optind;

...

My understanding is that char *inp_file = "" and char *out_file = "" are pointers to string literals.

Where do they point, considering the literal is an empty ""?

How can they be updated (lines 10 and 13) when they are stored in read-only memory?

Is char *pointer; the same as char *pointer = "";?


Furthermore, I tried this and it worked.

#include <stdio.h>

int main( int argc, char *argv[] )
{
    char *msg = "Hello";

    msg = "World";

    printf("%s\n", msg );// Prints 'World'
}

I'm 100% sure char *msg = "Hello"; is a pointer to a string literal.

Why does it get updated to 'World' when it's in read-only memory?

Is it a completely new assignment, or what?

I'm really confused about what I know about pointers. What am I missing here?

Upvotes: 0

Views: 260

Answers (3)

aghast

Reputation: 15300

There are actually two things going on. First, there is the string literal. You created a zero-length string, "", which is still NUL-terminated, because all C strings are NUL-terminated - that's how you know where the end is.

So you have a block of memory that looks like this:

Memory loc'n:  Contents
BASE+0x0000:  # start of string
BASE+0x0000:  '\0'  # end of string

That is, a block of memory that contains "no characters", followed by a trailing NUL byte to mark the end of the string.

That data is generally considered to be "constant." It may, or may not, be stored in "constant data." This depends on the linker, OS, etc.

However, that is only the "constant string literal." There is a second part to your code:

char *inp_file = "";

You have declared a pointer to the constant string literal. That pointer is a pointer-sized object (4 bytes if you have a 32-bit address space, 8 bytes if you have a 64-bit address space, some other size if you have a different, or mixed, address space) and contains the memory address of the first byte of the constant string literal.

Memory loc'n:      Contents
PTR_BASE+0x0000:   (BASE+0x0000)
PTR_BASE+0x0008:   ...

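If inp_file is declared outside of any function, it has file scope and static storage duration, and an initialized variable of that kind is stored in a data segment. (Uninitialized variables may be stored in a zero segment or uninitialized segment, depending on architecture.) In the snippet from the question the declarations appear to sit inside a function, since the code uses argc, argv, and return; in that case inp_file is an ordinary local variable, typically on the stack. Either way, the pointer is a separate, writable object.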

On the other hand, again depending on architecture and platform, a file scope constant may be stored in a data segment or in a text segment, either a separate constants segment or the same one containing the program code.

So you have two memory locations, possibly in different program segments. The first is the "literal string" you created, "". The second is the pointer variable you declared, inp_file. The pointer gets initialized with the address of the literal string, at load time for a file-scope variable or when the declaration is executed for a local one.

Once your program is running, you (might) execute code that says:

inp_file = optarg;

That causes the pointer variable to change its value. Now, instead of pointing at the literal string you first created, it points at a string determined by the getopt library. This is probably somewhere in the argv area, but it might be in a strdup-allocated block on the heap (because you don't know how getopt works, and what it might do on various systems). The literal "" itself is never written to.

Please be aware: back in the day, it was actually possible and commonplace to overwrite the "constant" strings that were used as initial values. You may find old programs that still do this. Modern C is pretty aggressive about discouraging this, but most code is legacy code. ;-)

Upvotes: 2

haccks

Reputation: 106012

My understanding is that char *inp_file = "" and char *out_file = "" are pointers to string literals.

Yes, they are.

Where are they pointing to ?

They are pointing to an empty string literal.

Is char *pointer; same as char *pointer = ""; ?

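No. char *pointer; is an uninitialised pointer, while char *pointer = ""; is an initialised one. In C, "" has type char[1] (const char[1] in C++) and holds a single element, '\0'. Even though the C type is not const-qualified, attempting to modify a string literal is undefined behaviour.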

Why it gets updated to "World" when it's in read-only memory ?

char *msg = "Hello"; should be treated as if it had been declared as

char const *msg = "Hello";  

This means that the string literal msg points to must not be modified, but that constraint applies to the string literal, not to the pointer that points to it. msg itself can be modified.

Is it a complete new reassignment or what ?

msg = "World"; assigns the address of a new string literal to the pointer msg. The literal "Hello" is left untouched.

Upvotes: 3

Mats Petersson

Reputation: 129344

You are NOT updating "Hello"; you are setting msg to point to a different string, "World". It may or may not work to do strcpy(msg, "World") instead (depending on system setup), but it's definitely undefined behaviour, so DO NOT write code that does this.

To show this, you could add a printf("Before: %p\n", (void*)msg); and printf("After: %p\n", (void*)msg); on either side of your msg = "World"; line.

Upvotes: 0
