TheRecon007

Reputation: 69

strtok shifts by one place when encountering a null character

When I use strtok to read comma-separated values from a string, the values that follow an empty field are shifted by one position: the empty field is skipped instead of being returned as (null).

#include <stdio.h>
#include <string.h>

int main(void)
{
    char Temp[10] = "1,2,3,,4";

    printf("%s\n", strtok(Temp, ","));
    printf("%s\n", strtok(NULL, ","));
    printf("%s\n", strtok(NULL, ","));
    printf("%s\n", strtok(NULL, ","));
    printf("%s\n", strtok(NULL, ","));
}

Expected Output:

1
2
3
(null)
4

Real Output:

1
2
3
4
(null)

The same thing happens even if there are 2 blanks. It just moves over twice, rather than once.

Upvotes: 3

Views: 109

Answers (3)

Luis Colorado

Reputation: 12708

The expected behaviour is the one you observe, and not the one you expect.

strtok treats every character in the delimiter string as a separator and skips any run of consecutive separators, not just a single character. It was designed to split strings the way the shell splits arguments, by means of the IFS variable. Later, as quoting was implemented in the shell, strtok remained in use, as it was still useful for splitting a string that contained spaces, tabs or newlines. You have misinterpreted what strtok(3) does; you need to use strsep(3) instead (a BSD extension, also available in glibc, that does return empty fields).

Upvotes: 0

Andreas Wenzel

Reputation: 25385

The function strtok does not have a notion of an empty token. If there is a sequence of more than one delimiter character, it will skip over all of them. This means that in your input, it will skip over any empty fields.

If you want to change that behavior, you will have to write your own strtok function, which is not that hard. Here is an example:

#include <stdio.h>
#include <string.h>

char *my_strtok( char *restrict str, const char *restrict delim )
{
    //this pointer will always point to the next token to
    //be processed and returned, or NULL if we have already
    //returned the last token
    static char *next;

    char *p;

    if ( str == NULL )
    {
        //check whether we have already returned the last
        //token
        if ( next == NULL )
        {
            //we already returned the last token in a
            //previous function call, so return NULL to
            //indicate that there are no more tokens
            return NULL;
        }

        //the next token is available, so process it
        str = next;
    }

    //find the next delimiter character
    p = strpbrk( str, delim );

    if ( p == NULL )
    {
        //this is the last token, so remember that there are
        //no more tokens
        next = NULL;
    }
    else
    {
        //there are more tokens available, so write a null
        //terminating character at the end of the current
        //token and remember the start of the next token
        *p = '\0';
        next = p + 1;
    }

    //return the token
    return str;
}

int main( void )
{
    char Temp[] = "1,2,3,,4";
    
    printf( "%s\n", my_strtok( Temp, "," ) );
    printf( "%s\n", my_strtok( NULL, "," ) );
    printf( "%s\n", my_strtok( NULL, "," ) );
    printf( "%s\n", my_strtok( NULL, "," ) );
    printf( "%s\n", my_strtok( NULL, "," ) );
    printf( "%s\n", my_strtok( NULL, "," ) );
}

Assuming that you are running on a platform on which

printf( "%s\n", NULL );

prints (null) instead of crashing the program, then this program will have the following output:

1
2
3

4
(null)

Note that the fourth field is now empty.

This answer was partially copied from a previous answer of mine to a similar question.

Upvotes: 3

dbush

Reputation: 225047

When strtok parses a string, any consecutive delimiter characters will be grouped together. This basically means that you'll never get an empty token: strtok returns only non-empty tokens, and then a null pointer once the string is exhausted.

The man page states the following:

From the above description, it follows that a sequence of two or more contiguous delimiter bytes in the parsed string is considered to be a single delimiter, and that delimiter bytes at the start or end of the string are ignored. Put another way: the tokens returned by strtok() are always nonempty strings. Thus, for example, given the string "aaa;;bbb,", successive calls to strtok() that specify the delimiter string ";," would return the strings "aaa" and "bbb", and then a null pointer.

So what you're seeing is the expected behavior.

If you want to be able to treat consecutive delimiters as a blank token, you'll need to use strchr to find each delimiter and copy out the relevant substrings yourself.

Upvotes: 5
