ty yang

Reputation: 11

Why can’t I use strtok directly on a string in C, and why do I need to copy it first?

I don't understand why I can't call strtok(argv[1], ";") directly.

(Would there be a difference between argv[1] really being input from the terminal and argv[1] being a string on the heap?)

    char *multiDecimalToBinary(char *argv[], int count) {
        int maxLen = 80 * (count + 1); // Max memory needed for result in this situation
        char *result = malloc(maxLen * sizeof(char));
        if (result == NULL) {
            return NULL;
        }
        result[0] = '\0'; // Initialize the result string

        // Why do I need to copy argv[1] here?
        // I think that when run from the terminal, argv might be a static value,
        // but in the test I pass in char **test = malloc(...)
        char *s = malloc((strlen(argv[1]) + 1) * sizeof(char));
        strcpy(s, argv[1]);
        char *type = strtok(s, ";");

        //    printf("%s\n", argv[1]);
        //    char *type = strtok(argv[1], ";");
        //    printf("here\n");
        // ...
    }

    void testMultiDecimalToBinary(void) {
        char **test = malloc(5 * sizeof(char *));

        test[1] = "{char;int;unsigned char}";
        test[2] = "7";
        test[3] = "10000000";
        test[4] = "255";
        assert(strcmp(multiDecimalToBinary(test, 2),
        "0000 0111 0000 0000 1001 1000 1001 0110 1000 0000 1111 1111") == 0);

        free(test);
    }

I thought both of these methods should work, but only the one that copies the string first does.

Upvotes: -2

Views: 119

Answers (2)

H.S.

Reputation: 12679

Would there be a difference between argv[1] really being input from the terminal and argv[1] being a string on the heap?

I believe your question is whether the contents of argv are modifiable when it holds the arguments passed to the program from the terminal.

From the C18 standard, section 5.1.2.2.1:

— The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program, and retain their last-stored values between program startup and program termination.

Strings passed as arguments to a program, i.e. input from the terminal, are modifiable.
You can safely pass the elements of the argv parameter of main() (its second parameter, an array of char * conventionally named argv) to strtok().
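
As a minimal sketch of that point, the helper below tokenizes argv[1] in place, with no copy. The name print_fields and the ";" delimiter are assumptions made for illustration; the idea is that main() would simply forward its own argc and argv to it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, meant to be called as print_fields(argc, argv)
 * from main(). Tokenizes argv[1] in place and returns the number of
 * fields, or -1 if argv[1] is missing. */
int print_fields(int argc, char *argv[]) {
    if (argc < 2) {
        return -1; /* no argv[1] to tokenize */
    }
    assert(argv[1] != NULL);

    int n = 0;
    /* argv strings are modifiable, so strtok may write into argv[1]. */
    for (char *tok = strtok(argv[1], ";"); tok != NULL; tok = strtok(NULL, ";")) {
        printf("%s\n", tok);
        n++;
    }
    return n;
}
```

After the call, argv[1] itself reads as the first field only, because the first delimiter has been overwritten.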

Now, coming to your code:
The strings assigned to the elements of the test array are not modifiable:

    test[1] = "{char;int;unsigned char}";
    test[2] = "7";
    test[3] = "10000000";
    test[4] = "255";

From the C18 standard, section 6.4.5 (String literals):

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

If you pass a string literal to strtok(), your program has undefined behavior, because strtok() modifies its input string: it overwrites the delimiter character that ends each token with '\0'.
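
To make that write visible, here is a small sketch that calls strtok() once on a modifiable buffer and reports where the '\0' was stored. The helper name first_cut_index is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: calls strtok once on buf and returns the index at
 * which strtok overwrote a delimiter with '\0', or -1 if nothing was
 * overwritten (no token found, or the token ran to the end of the string). */
long first_cut_index(char *buf, const char *delims) {
    assert(buf != NULL && delims != NULL);

    size_t len_before = strlen(buf);
    char *tok = strtok(buf, delims);
    if (tok == NULL) {
        return -1;
    }
    /* The token now ends at the byte strtok replaced with '\0'. */
    size_t end = (size_t)(tok - buf) + strlen(tok);
    return end < len_before ? (long)end : -1;
}
```

With a buffer declared as char a[] = "char;int", the call first_cut_index(a, ";") returns 4, and a[4] holds '\0' instead of ';'. Doing the same with a pointer to a string literal would attempt that write on a non-modifiable array.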

In your program, the elements of the test array (test[1], test[2], test[3] and test[4]) point to string literals. Instead of passing them directly to strtok(), the code makes a copy, which it is allowed to modify, and passes that copy to strtok().
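
The copy-then-tokenize pattern can be sketched as follows. The helper name count_tokens_copy is an assumption for illustration, not something from your code; it takes a const char * so it is safe to call with a string literal.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in s without modifying s.
 * Returns -1 if the copy cannot be allocated. */
long count_tokens_copy(const char *s, const char *delims) {
    assert(s != NULL && delims != NULL);

    size_t size = strlen(s) + 1;      /* include the terminating '\0' */
    char *copy = malloc(size);
    if (copy == NULL) {
        return -1;
    }
    memcpy(copy, s, size);

    long n = 0;
    /* strtok writes only into the private copy, never into s. */
    for (char *tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims)) {
        n++;
    }
    free(copy);
    return n;
}
```

Note that the tokens point into the copy, so any token that must outlive the helper has to be copied out before free() is called.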

Upvotes: 2

0___________

Reputation: 67835

strtok modifies the string, so the string has to be modifiable. If it is not (for example, if you pass a string literal), you invoke undefined behaviour.

Even if the string can be modified, you may sometimes want to preserve the original. In that case you also need to work on a copy of the string.

I don't understand why I can't call strtok(argv[1], ";") directly.

You can, if the above conditions are met, i.e. the string is modifiable and you do not mind it being modified.

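You can change your test to pass char arrays (written here as compound literals) instead of string literals, and it will work without creating copies. Inside a function these arrays have automatic storage duration, so they are valid only until the enclosing block ends.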

    test[1] = (char[]){"{char;int;unsigned char}"};
    test[2] = (char[]){"7"};
    test[3] = (char[]){"10000000"};
    test[4] = (char[]){"255"};
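
A minimal self-contained sketch of that change, assuming a hypothetical demo function rather than your real test: the array elements point at compound-literal char arrays, so strtok() may write into them directly.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical demo: builds an argv-like array from compound literals and
 * tokenizes args[1] in place. Returns the number of tokens found. */
int count_fields_from_compound_literals(void) {
    char *args[2];
    /* Each compound literal is a modifiable char array that lives
     * until the end of this function body. */
    args[0] = (char[]){"prog"};
    args[1] = (char[]){"{char;int;unsigned char}"};

    int n = 0;
    for (char *tok = strtok(args[1], ";"); tok != NULL; tok = strtok(NULL, ";")) {
        n++;
    }
    /* The first ';' was overwritten, so args[1] now reads as the first token. */
    assert(strcmp(args[1], "{char") == 0);
    return n;
}
```

Because of the automatic storage duration, pointers to these arrays must not be returned from or used after the function that created them.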

Upvotes: 1
