RiseWithMoon

Reputation: 104

Why does strtok() not tokenize the string a certain way?

I'm trying to tokenize a string using brackets [] as delimiters. It works exactly how I want for one input but fails for another. For example, if the string has characters before the first delimiter, it works fine, but if nothing comes before the delimiter, I get the wrong tokens.

This one gives me an error. The token2 ends up being NULL and token is "name]" with the bracket still on there.

char name[] = "[name]";
char *token = strtok(name, "[");
char *token2 = strtok(NULL, "]");

Output:

token = name]
token2 = NULL

However, if I have the following, then it works just fine.

char line[] = "Hello [name]";
char *tok = strtok(line, "[");
char *tok2 = strtok(NULL, "]");

Output:

tok = Hello
tok2 = name

I don't understand what I'm doing wrong when the input is simply something like "[name]". I want only what's inside the brackets.

Edit: Thanks for the input, everyone. I found a solution to what I'm trying to do. Per @Ryan and @StoryTeller's advice, I first check whether the input begins with [ and, if so, tokenize with both brackets as delimiters ("[]"). Here's what I tried, and it worked for this input:

char name[] = "[name]", *token = NULL, *token2 = NULL;

if (name[0] == '[')
{
    token = strtok(name, "[]");
}
else
{
    token = strtok(name, "[");
    token2 = strtok(NULL, "]");
}

Upvotes: 1

Views: 1435

Answers (2)

David C. Rankin

Reputation: 84561

If you are simply trying to extract the contents between a single pair of brackets [...], then strchr provides a more straightforward way to accomplish the task. When you call strtok with a single delimiter (first "[" and then "]"), you are essentially doing what two successive calls to strchr would do: one to locate '[' and one to locate ']'.

For example, the following will parse the string given on the command line for the characters between brackets ("[some name]" by default if no argument is given), up to a maximum of MAXNM characters (including the nul-terminating character):

#include <stdio.h>
#include <string.h>

#define MAXNM 128

int main (int argc, char **argv) {

    char *s = argc > 1 ? argv[1] : "[some name]",   /* input */
        *p = s,                 /* pointer */
        *ep,                    /* end pointer */
        buf[MAXNM] = "";        /* buffer for result */

    /* if starting and ending bracket are present in input */
    if ((p = strchr (s, '[')) && (ep = strchr (p, ']'))) {
        if (ep - p > MAXNM) {   /* length + 1 > MAXNM ? */
            fprintf (stderr, "error: result too long.\n");
            return 1;
        }
        /* copy between brackets to buf (+1 for char after `[`) */
        strncpy (buf, p + 1, ep - p - 1);  /* ep - p - 1 for length */
        buf[ep - p - 1] = 0;    /* nul terminate, also done via initialization */
        printf ("name : '%s'\n", buf);  /* output the name */
    }
    else
        fprintf (stderr, "error: no enclosing brackets found in input.\n");

    return 0;
}

Note: the benefit of using strchr and strncpy for parsing between fixed delimiters is that you do not modify the original string (as strtok does). So this method is safe for use with string literals or other constant strings.
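The same strchr approach can be wrapped in a small reusable function. This is only a sketch: the name extract_bracketed, its parameters, and its return convention are made up for illustration and are not part of the answer above. It takes a const input, so it never writes to the source string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original answer): copies the text
 * between the first '[' and the next ']' in src into dst, which holds
 * dstsize bytes. Returns the copied length, or -1 if no bracket pair
 * is found or the result (plus terminator) does not fit in dst. */
int extract_bracketed(const char *src, char *dst, size_t dstsize)
{
    const char *open = strchr(src, '[');
    const char *close = open ? strchr(open + 1, ']') : NULL;
    size_t len;

    if (!close || dstsize == 0)
        return -1;

    len = (size_t)(close - open - 1);   /* characters between the brackets */
    if (len >= dstsize)                 /* need room for the terminator */
        return -1;

    memcpy(dst, open + 1, len);         /* src itself is never modified */
    dst[len] = '\0';
    return (int)len;
}
```

Calling extract_bracketed("Hello [name]", buf, sizeof buf) with a char buf[32] should leave "name" in buf, and the same call works on a string literal because nothing is written to the input.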

Example Use/Output

$ ./bin/brackets
name : 'some name'

$ ./bin/brackets "this [is the name] and more"
name : 'is the name'

Upvotes: 0

CIsForCookies

Reputation: 12817

In short: the 2nd time you called strtok() in your first example is the same as calling it on an empty string and this is why you get NULL.

Each call to strtok gives you the token based on your chosen delimiter. In your 1st try:

char name[] = "[name]";
char *token = strtok(name, "[");
char *token2 = strtok(NULL, "]");

The delimiter you chose is "[", so the 1st call to strtok returns "name]": leading delimiters are skipped (remember that the string starts with one), and since no further "[" follows, the token runs to the end of the string. The second call returns NULL, because "name]" was the end of your original string and invoking strtok() now is like invoking it on an empty string.

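strtok() does not copy your string. It keeps a static pointer into it that records where the next search should start, and each invocation moves that pointer past the token it returns. After your 1st call, the pointer was already at the end of the string, so nothing was left to tokenize.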
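To see this behavior directly, the two calls from the question can be put into a small function and run on both inputs. This is just an illustrative sketch; split_brackets is a made-up name, not something from the question.

```c
#include <string.h>

/* Hypothetical helper mirroring the question's two calls.
 * s must be a writable array, because strtok overwrites the
 * delimiter that ends each token with '\0'. */
void split_brackets(char *s, char **first, char **second)
{
    /* Skips any leading '[' and returns everything up to the next '['
     * (or up to the end of the string if there is none). */
    *first = strtok(s, "[");

    /* Continues from the position saved by the previous call.
     * If the first token already ran to the end of the string,
     * nothing is left and this returns NULL. */
    *second = strtok(NULL, "]");
}
```

With char name[] = "[name]", first points to "name]" and second is NULL. With char line[] = "Hello [name]", first points to "Hello " (including the trailing space) and second points to "name".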

In your 2nd try:

char line[] = "Hello [name]";
char *tok = strtok(line, "[");
char *tok2 = strtok(NULL, "]");

You call strtok on a string with the delimiter in the middle of it, so you get a token AND part of the string is still left after the position that strtok saved (it overwrote the "[" with a nul character and remembered the character after it). That enables the 2nd call of strtok() to return a valid token instead of NULL.
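If you want to keep using strtok and handle both inputs the same way, one possible approach is to locate the opening bracket yourself and start tokenizing just after it. The sketch below is an assumption about what you need, not a definitive solution: bracket_token is a hypothetical name, and treating empty brackets as "no token" is a choice made here, because strtok skips leading delimiters and would otherwise return whatever follows the "]".

```c
#include <string.h>

/* Hypothetical helper: returns a pointer to the text inside the first
 * [...] pair in s, or NULL if there is no pair or the pair is empty.
 * s is modified: strtok overwrites the closing ']' with '\0'. */
char *bracket_token(char *s)
{
    char *open = strchr(s, '[');

    /* No '[', empty "[]", or no closing ']' after the '[' */
    if (!open || open[1] == ']' || !strchr(open + 1, ']'))
        return NULL;

    /* Start after '[' so it no longer matters whether anything
     * preceded the bracket in the original string. */
    return strtok(open + 1, "]");
}
```

Both "[name]" and "Hello [name]" should yield "name" here, since the text before the bracket is skipped before strtok ever sees it.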

Upvotes: 1
