wadie el

Reputation: 55

difference between a string defined by pointer or an array

I was reading about pointers in K&R book here:

https://hikage.freeshell.org/books/theCprogrammingLanguage.pdf

There is an important difference between these definitions:

 char amessage[] = "now is the time"; /* an array */

 char *pmessage = "now is the time"; /* a pointer */

amessage is an array, just big enough to hold the sequence of characters and '\0' that initializes it. Individual characters within the array may be changed but amessage will always refer to the same storage. On the other hand, pmessage is a pointer, initialized to point to a string constant; the pointer may subsequently be modified to point elsewhere, but the result is undefined if you try to modify the string contents.

I don't understand why we can't modify the string contents!

Upvotes: 0

Views: 636

Answers (3)

Luis Colorado

Reputation: 12668

When you initialize a pointer with a string literal, the compiler creates an array that you must treat as read-only, and it is free to merge identical literals into a single array if several initializers use the same string (character for character), as in:

char *a = "abcdef", *b = "abcdef";

both pointers may well be initialized to the same address in memory. This is why you are not allowed to modify the string, and why the behaviour can be unpredictable: you don't know whether the compiler has merged the two strings. It goes further, as the compiler is also permitted to do the following in the next scenario:

char *a = "foo bar", *b = "bar";

Here the compiler is permitted to initialize a to point to a char array with the characters {'f', 'o', 'o', ' ', 'b', 'a', 'r', '\0'} and to initialize b to point to the fifth element of that same array (index 4), because one string literal is a suffix of the other.

Permitting this lets the compiler make substantial savings in the final executable, so string literals are typically placed in a read-only segment of the executable (such as .rodata or .text).
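Whether a particular compiler actually merges literals can be observed by comparing addresses. The following is only an illustrative sketch: the helper names same_address and report_literal_sharing are made up for this example, and the printed answers depend on the compiler and its options, so both "yes" and "no" are valid results.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: true when both pointers hold the same address. */
static bool same_address(const char *p, const char *q)
{
    return p == q;
}

/* Hypothetical helper: reports what this compiler did with the literals.
 * The C standard leaves both results unspecified. */
static void report_literal_sharing(void)
{
    const char *a = "foo bar";
    const char *b = "bar";
    const char *c = "foo bar";

    printf("identical literals merged: %s\n",
           same_address(a, c) ? "yes" : "no");
    /* a + 4 is the address of the 'b' inside "foo bar". */
    printf("suffix literal merged:     %s\n",
           same_address(a + 4, b) ? "yes" : "no");
}
```

Calling report_literal_sharing() from main with different optimization levels is one way to see how the result can change between builds.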

On the other hand, initializing an array poses no such problem, because you are defining the array that stores the characters yourself; the literal only supplies its initial contents. An initialization like:

char a[] = "Hello";

defines an array of char with space for six characters (five letters plus the terminating '\0'). You can also specify the array size between the brackets, as in

char a[32] = "Hello";

and then the array will have 32 elements (indices 0 to 31): the first five are initialized to the characters 'H', 'e', 'l', 'l' and 'o', and the remaining 27 to the null character '\0'. Some compilers will even accept:

char a[4] = "Hello";

but in this case the literal does not fit in the array. Compilers such as GCC issue a warning and initialize the array as {'H', 'e', 'l', 'l'}: only the first four characters of the string literal are used, there is no terminating '\0', and the array is therefore no longer a valid C string.
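The sizes described for the first two declarations can be checked with sizeof. This is a minimal sketch; the function names inferred_size, explicit_size and tail_is_zero are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Size of an array whose length is inferred from the literal. */
static size_t inferred_size(void)
{
    char a[] = "Hello";
    return sizeof a;            /* five letters plus the terminating '\0' */
}

/* Size of an array whose explicit length exceeds the literal's length. */
static size_t explicit_size(void)
{
    char a[32] = "Hello";
    return sizeof a;
}

/* True when every element after "Hello" in the 32-element array is '\0'. */
static bool tail_is_zero(void)
{
    char a[32] = "Hello";
    for (size_t i = 5; i < sizeof a; i++) {
        if (a[i] != '\0')
            return false;
    }
    return true;
}
```

Here inferred_size() yields 6, explicit_size() yields 32, and tail_is_zero() is true, because elements without an explicit initializer are zero-initialized.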

Finally, always keep in mind that assignment and initialization are different things, even though both use the same symbol =. You will never be allowed to write a statement like:

    char a[26];
    a = "foo bar";

because an array cannot be assigned to. The expression "foo bar" is converted to a char * pointing to the first element of a static (unmodifiable) array, and a itself is not a modifiable lvalue.
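To give an existing array new contents after its definition, the characters have to be copied into it instead. A minimal sketch, assuming truncation is acceptable when the text does not fit; set_text is a hypothetical helper name, not something from the standard library.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copies text into dst, truncating if necessary.
 * snprintf always null-terminates the result when size > 0. */
static void set_text(char *dst, size_t size, const char *text)
{
    snprintf(dst, size, "%s", text);
}
```

With char a[26]; the call set_text(a, sizeof a, "foo bar"); leaves a holding "foo bar". strcpy would also work when the destination is known to be large enough.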

Upvotes: 1

Eric Postpischil

Reputation: 222744

I don't understand why we can't modify the string contents!

Because the C standard says so: “If the program attempts to modify such an array [the array defined by a string literal], the behavior is undefined” (C 2018 6.4.5 7). A string literal is a sequence of characters in quotes in source code, such as "Hello, world.\n". (String literals may also be preceded by an encoding prefix u8, u, U, or L, as in L"abc".) A string literal defines an array containing the characters of the string plus a terminating null character.

A reason that attempting to modify the string literal's array is undefined is that string literals were, and are, widely used for strings that are constant: error messages to be printed at times, format strings for printf operations, hard-coded names of things, and so on. As C developed, and the standard was written, it made sense for string literals to be treated as read-only and to allow a compiler to put them in read-only storage. Additionally, some compilers would use the same storage for identical string literals that appeared in different places, and some would use the same storage for a string literal that was a trailing substring of another string literal. Because of this shared storage, modifying one string would also modify the other. So allowing programs to modify string literals could cause some problems.

So, if you merely point to a string literal, you are pointing to something that should not be modified. If you want your own copy that can be modified, simply define it with an array as you show with char amessage[] = "now is the time";. Such a definition defines an array, amessage, that has its own storage. That array is initialized with the contents of the string literal but is separate from it.
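As a small illustration of modifying such a copy, the sketch below changes the first character of the array and hands the result back to the caller. The function name capitalized_message is hypothetical, and the caller is assumed to provide at least 16 bytes (15 characters plus the terminating null).

```c
#include <string.h>

/* Hypothetical helper: writes a modified copy of the message into out.
 * out must have room for at least 16 bytes. */
static void capitalized_message(char *out)
{
    char amessage[] = "now is the time";   /* array with its own storage */

    amessage[0] = 'N';                     /* well-defined: changes the copy */
    memcpy(out, amessage, sizeof amessage);
}
```

After char buf[16]; capitalized_message(buf); the buffer holds "Now is the time", and the string literal itself is never written to.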

Upvotes: 3

0___________

Reputation: 67546

  1. char amessage[] = "now is the time"; /* an array */

amessage is a modifiable array of chars.

  2. char *pmessage = "now is the time"; /* a pointer */

pmessage is a pointer to the string literal. Attempting to modify the string literal is undefined behaviour.
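A common way to let the compiler catch accidental writes is to declare such a pointer as const char *. This goes slightly beyond the declarations above and is offered only as a sketch; repointed_message is a hypothetical name, and the second literal is just sample text.

```c
/* Hypothetical helper: shows that the pointer can be re-pointed while the
 * characters it points to stay read-only. */
static const char *repointed_message(void)
{
    const char *pmessage = "now is the time";

    pmessage = "for all good men";   /* fine: the pointer itself is modifiable */
    /* pmessage[0] = 'F'; */         /* rejected at compile time: pointee is const */

    return pmessage;                 /* literals have static storage duration */
}
```

Returning the pointer is safe here because the array behind a string literal lives for the whole run of the program.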

Upvotes: 1
