Reputation: 31
As far as I know, a string literal can't be modified. For example:
char* a = "abc";
a[0] = 'c';
That would not work since a string literal is read-only. I can only modify the string if it is declared as an array:
char a[] = "abc";
a[0] = 'c';
However, in this post, "Parse $PATH variable and save the directory names into an array of strings", the first answer modified a string literal in these two places:
path_var[j]='\0';
array[current_colon] = path_var+j+1;
I'm not very familiar with C so any explanation would be appreciated.
Upvotes: 2
Views: 1008
Reputation: 12708
There are several reasons why you had better not modify them:
.text segment.
char array[100] = "abc"; // initialized to { 'a', 'b', 'c', '\0',
                         //   /* and 96 more '\0' characters */
                         // };
Upvotes: 0
Reputation: 679
Code blocks from the post you linked:
const char *orig_path_var = getenv("PATH");
char *path_var = strdup(orig_path_var ? orig_path_var : "");
const char **array;
array = malloc((nb_colons+1) * sizeof(*array));
array[0] = path_var;
array[current_colon] = path_var+j+1;
First block:
- getenv() returns a pointer to a string, and orig_path_var points to that string. The string that getenv() returns should be treated as read-only, as the behaviour is undefined if the program attempts to modify it.
- strdup() is called to make a duplicate of this string. It does this by calling malloc() to allocate memory for the length of the string + 1, and then copying the string into that memory.
- Because malloc() is used, the string is stored on the heap, which allows us to modify it.

Second block:
- array points to an array of const char * pointers. There are nb_colons+1 pointers in the array.
- The first element of array is initialized to path_var (remember it is not a string literal, but a modifiable copy of the string returned by getenv()).
- The current_colon-th element of array is set to path_var+j+1. If you don't understand pointer arithmetic, this just means it assigns the address of the character at index j+1 of path_var to array[current_colon].

As you can see, the code is not operating on read-only strings such as the one orig_path_var points to. Instead it uses a copy made with strdup(). This seems to be where your confusion stems from, so take a look at this:
char *strdup(const char *s);
The strdup() function returns a pointer to a new string which is a duplicate of the string s. Memory for the new string is obtained with malloc(3), and can be freed with free(3).
The above text shows what strdup() does according to its man page. It may also help to read the malloc() man page.
Upvotes: 1
Reputation: 48052
In programming, there are quite a few rules that are up to you to follow, even though they are not necessarily enforced. And "String literals in C are not modifiable" is one of those. So is "Strings returned by getenv should not be modified".
There are some real-world analogies that apply. Here's one: If you're at an intersection, and the light is red, you're not supposed to cross. But, much of the time, if you break the rule, and cross, you might get away with it. You might get a ticket from a policeman — or you might not. You might cause a crash — or you might not. But if you get lucky, and neither of these things happens, that does not imply that crossing the intersection against the red light was okay — it's still quite true that it was very much against the rules.
Similarly, in C, if you write some code that modifies a string literal, or a string returned from getenv, you might get away with it. The compiler might give you a warning or error message, or it might not. Your program might crash, or it might not. But if the program seems to work, that does not imply that these strings are actually modifiable: they're not.
Upvotes: 2
Reputation: 58647
In the example
char* a = "abc";
the token "abc" produces a literal object in the program image, and denotes an expression which yields that object's address.
In the example
char a[] = "abc";
the token "abc" serves as an array initializer, and doesn't denote a literal object. It is equivalent to:
char a[] = { 'a', 'b', 'c', 0 };
The individual character values of "abc" are literal data recorded somewhere and somehow in the program image, but they are not accessible as a string literal object.
The array a isn't a literal, needless to say. Modifying a doesn't constitute modifying a literal, because it isn't one.
Regarding the remark:
That would not work since string literal is read-only.
That isn't accurate. No version of the ISO C standard to date specifies any requirements for what happens if a program tries to modify a string literal. It is undefined behavior. If your implementation stops the program with some diagnostic message, that is one possible outcome of undefined behavior, not something the standard requires.
C implementations are not required to support string literal modification, which has benefits such as:
- Standard-conforming C programs can be translated into images that can be burned into ROM chips, such that their string literals are accessed directly from that ROM image without having to be copied into RAM on start-up.
- Compilers can condense the storage for string literals by taking advantage of situations when one literal is a suffix of another. The expression "string" + 2 == "ring" can yield true. Since a strictly conforming program will not do something like "ring"[0] = 'w', due to that being undefined behavior, such a program will thereby avoid falling victim to the surprise of "string" unexpectedly turning into "stwing".
Upvotes: 1