drZaius

Reputation: 149

C segmentation fault after altering argv

I want to change the values of argv in C, but I'm getting a segmentation fault. Here's the code.

#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    for (int i = 1; argv[i]; i++)
    {
        char *val;

        printf("before: %d %s\n", i, argv[i]);
        argv[i] = "bar=foo";
        printf("after: %d %s\n", i, argv[i]);

        char *arg = argv[i];
        val = strchr(arg, '=');
        *val = '\0';
    }
    return 0;
}

I'm passing the argument foo=bar and trying to change it to bar=foo on line 11. The output looks like this:

before: 1 foo=bar
after: 1 bar=foo

So, the modification actually takes place, but the line *val = '\0'; causes a segmentation fault.

Can somebody tell me why this is and how I can prevent it?

Upvotes: 1

Views: 368

Answers (3)

Leonid Kalichkin

Reputation: 1

The segmentation fault happens because the program attempts to write to read-only memory. In your example, you are trying to modify the string literal "bar=foo" through the val pointer. Your compiler placed this string literal in a segment of the executable file that, in your execution environment, was loaded into a memory region with read-only access.

The compiler could have placed this string literal in a different segment, or the execution environment might not have enforced a write restriction on the memory region where the string literal was loaded. In that case, you would not have had a segmentation fault.

Since the C standard does not require string literals to be always placed in a read-only memory region during execution, in some cases they can be successfully modified while in others this operation is not allowed, and that is why the standard says:

(C90, 6.1.4) "If the program attempts to modify a string literal of either form, the behavior is undefined"

This mistake could have been caught at compile time by passing the -Wwrite-strings option to GCC or Clang when compiling the sample. In C, this option gives string literals the type const char[] (they are plain char[] by default), so the compiler warns whenever a string literal is assigned to a non-const pointer that could later be used to write to memory that may be read-only.

warning: assignment discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
   11 |         argv[i] = "bar=foo";
      |                 ^

By also specifying -Werror or -Werror=discarded-qualifiers, the warning is treated as an error, and you would have caught the mistake before execution because the sample would not have compiled.

Upvotes: 0

Some programmer dude

Reputation: 409432

You make argv[i] point to a string literal, which is an array of characters that must be treated as read-only. Then you attempt to modify this array, which leads to undefined behavior.

There's a reason you should be using const char * for string literals.
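
As a minimal sketch of that advice: if the pointer to the literal is declared const char *, the compiler rejects the offending write even without extra warning flags. The helper name equals_offset is hypothetical and only serves the illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the offset of the first '=' in s,
 * or -1 if s contains no '='. It only reads from s. */
static long equals_offset(const char *s)
{
    assert(s != NULL);

    const char *val = strchr(s, '=');

    /* *val = '\0';   <- rejected by the compiler: val points to const char */

    return val != NULL ? (long)(val - s) : -1;
}
```

Because s and val are both const-qualified, passing a string literal such as "bar=foo" is safe here: the function has no way to write through either pointer by accident.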


There are a couple of ways to solve your problem. The first and simplest, especially if you are going to use the same string for all arguments (not very likely except in a contrived example such as the one you show), is to use an array, like char argument[] = "bar=foo";. Relying on the natural decay of arrays to pointers to their first elements, you could use that in the assignment to argv[i] instead, and modify the array to your heart's content.
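
A rough sketch of the array approach follows. The helper replace_args and its parameter names are hypothetical; the caller is assumed to own a writable array such as char argument[] = "bar=foo"; that outlives the use of argv.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: points every argument after argv[0] at the same
 * writable buffer, then cuts that buffer at its first '='. */
static void replace_args(char **argv, char *replacement)
{
    assert(argv != NULL && replacement != NULL);

    for (int i = 1; argv[i] != NULL; i++)
        argv[i] = replacement;      /* the array has decayed to char * */

    char *val = strchr(replacement, '=');
    if (val != NULL)
        *val = '\0';                /* fine: replacement is a writable array */
}
```

From main this could be called as replace_args(argv, argument); with argument declared as the array above. Note that every argument then shares one buffer, so the cut at '=' is visible through all of them, which is why this only suits simple cases.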

But like I said, that won't really be very useful except for simple examples. That leaves us with another option: using e.g. strdup to dynamically allocate and copy the strings (strdup comes from POSIX and only became part of standard C with C23, so if it is not available you could use malloc and strcpy manually). Since the memory is then dynamically allocated, it can also be modified to your heart's content. The catch is that you need to keep track of the pointers returned by the allocation and pass them to free when you are done.

So these are basically the two solutions you could use. One is unrealistic, and the other has dynamic memory allocation and all the problems that entails. There are really no good and especially no simple solutions to your problem.
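
For illustration, the dynamic-allocation route might look roughly like this. Both helper names (copy_string, split_at_equals) are hypothetical; copy_string stands in for strdup on systems that lack it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for strdup: returns a heap-allocated, writable
 * copy of s, or NULL if the allocation fails. The caller must free it. */
static char *copy_string(const char *s)
{
    assert(s != NULL);

    size_t len = strlen(s) + 1;     /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Hypothetical helper: splits "key=value" in place at the first '='.
 * Returns a pointer to the value part, or NULL if there is no '='. */
static char *split_at_equals(char *arg)
{
    assert(arg != NULL);

    char *val = strchr(arg, '=');
    if (val == NULL)
        return NULL;                /* avoids dereferencing a null pointer */
    *val = '\0';                    /* safe only if arg is writable memory */
    return val + 1;
}
```

In the loop from the question, one would assign argv[i] = copy_string("bar=foo");, check the result for NULL, call split_at_equals(argv[i]), and later free each copy. The original argv[i] pointers must not be passed to free, since the program did not allocate them.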

Upvotes: 9

Iharob Al Asimi

Reputation: 53016

You are writing to a string literal here:

val = strchr(arg, '=');
*val = '\0';

String literals typically live in a read-only segment of the program, and trying to alter them invokes undefined behavior.

After this line

argv[i] = "bar=foo";

argv[i] points to the string literal "bar=foo". You then create a new pointer arg that holds the same address as argv[i], and finally val, a pointer to the = inside that same literal, which you then try to overwrite with a nul character.

Upvotes: 3
