gibraltar

Reputation: 1708

Strange behavior of String tokenizer in C

I have written the following program to split a path into its directory names:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

char *
tokenizer(char *path, char **name){
  char s[300];
  char *buffer;
  memcpy(s, path, strlen(path)+1);
  printf("%s\n",s);    // PROBLEM
  int i=0;
  while(s[i] == '/'){
    i++;
  }
  if (i == strlen(path)){
    return NULL;
  }
  *name = strtok_r(s, "/", &buffer);
  return buffer;
}

int main(void){
  char str[300];
  char *token, *p;
  scanf("%s",str);
  p = tokenizer(str, &token);
  if (p != NULL)
    printf("%s\n",token);
  else
    printf("Nothing left\n");
  while((p=tokenizer(p, &token)) != NULL){
    printf("%s\n",token);
  }
}

Output of the above program

Input: a/b/c
Output: a/b/c
a/b/c
a
b/c
b
c
c

If I comment out the line labelled PROBLEM:

Input: a/b/c
Output: Some garbage value

Can somebody explain the reason for this strange behavior?

Note: I have realised that s is a stack-allocated variable and ceases to exist once tokenizer() returns to main(), but why does the program work when I use printf()?

Upvotes: 1

Views: 275

Answers (5)

Jack

Reputation: 16724

Try this:

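#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copies the next '/'-separated component into a newly allocated buffer.
   Pass the path on the first call and NULL on later calls, as with strtok().
   The caller owns *name and must free() it. */
char*
token(const char * path, char ** name){

    static const char * obuffer = NULL; /* where the next call resumes */
    const char * p = path ? path : obuffer;
    char * buffer, * q;

    if(!p || !*p) return NULL; /* nothing left to tokenize */

    buffer = malloc(strlen(p) + 1); /* a component is never longer than the rest */
    if(!buffer) return NULL;
    q = buffer;

    while(*p != '\0') {
          if(*p == '/') {
            p++; /* remove the / from string. */
            break;
          }
          *q ++ = *p++;
    }

    *q = '\0';
    obuffer = p;
    *name = buffer;

    return buffer;
}

int main(void)
{
    const char * s = "foo/baa/hehehe/";
    char * name = NULL;
    char * t = token(s, &name);
    while(t) {
        printf("%s\n", name);
        free(name); /* each component was allocated by token() */
        t = token(NULL, &name);
    }

    return 0;
}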

The output:

foo
baa
hehehe

But you are basically reinventing the wheel here: this is what the strtok() function already does.

Upvotes: 0

Jerry Coffin

Reputation: 490108

Along with the observation that you're returning a pointer into a local variable, I think it's worth noting that your tokenizer is almost 100% pointless.

Most of what your tokenizer does is skip across any leading / characters before calling strtok_r, but you're passing "/" as the delimiter string to strtok_r, which will automatically skip across any leading delimiter characters on its own.

Rather simpler code suffices to print out the components of a path without the delimiters:

char path[] = "a/b/c";
char *pos = NULL;

char *component = strtok_r(path, "/", &pos);
while (NULL != component) { 
    printf("%s\n", component);
    component = strtok_r(NULL, "/", &pos);
}
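
To make the delimiter-skipping point concrete, here is a minimal sketch that wraps the same loop in a helper. The name split_components and its out/max parameters are made up for illustration and are not part of any library. The pointers it stores refer to the caller's own buffer, so they stay valid for as long as that buffer does.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r on strict compilers */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `path` in place on '/' and stores up to
   `max` component pointers in `out`. Returns the number stored.
   Leading, trailing and repeated '/' characters are skipped by
   strtok_r itself, so no manual skipping loop is needed. */
size_t split_components(char *path, char **out, size_t max)
{
    assert(path != NULL && out != NULL);

    char *save = NULL;   /* touched only by strtok_r */
    size_t n = 0;

    for (char *c = strtok_r(path, "/", &save);
         c != NULL && n < max;
         c = strtok_r(NULL, "/", &save)) {
        out[n++] = c;    /* points into the caller's `path` buffer */
    }
    return n;
}
```

Called on a writable array such as char path[] = "///a//b/c/"; this should store three components and never produce an empty one.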

Upvotes: 0

ugoren

Reputation: 16441

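In addition to what geekosaur says:

strtok_r's 3rd parameter is used incorrectly, in two ways:
1. It should be initialized to NULL before the first call. (strtok_r ignores its initial value when the first argument is non-NULL, so this is a safe habit rather than the cause of the bug.)
2. It shouldn't be read or used in any other way, yet you return it to the caller. It should only be passed to the next strtok_r call.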
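
As a rough illustration of both points, the sketch below keeps the save pointer private to one function and hands it to nothing but strtok_r. The helper name last_component is invented for this example. It works on a heap copy so the caller's string is left untouched, and it returns a separately allocated result so nothing points into freed or out-of-scope memory.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r on strict compilers */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd copy of the last '/'-separated
   component of `path`, or NULL if there is none (or on allocation
   failure). The caller must free() the result. */
char *last_component(const char *path)
{
    assert(path != NULL);

    size_t len = strlen(path) + 1;
    char *work = malloc(len);        /* private, writable copy */
    if (work == NULL)
        return NULL;
    memcpy(work, path, len);

    char *save = NULL;               /* point 1: initialized to NULL */
    char *last = NULL;
    for (char *c = strtok_r(work, "/", &save);
         c != NULL;
         c = strtok_r(NULL, "/", &save)) {  /* point 2: save only goes back in */
        last = c;
    }

    char *result = NULL;
    if (last != NULL) {
        size_t n = strlen(last) + 1;
        result = malloc(n);          /* outlives `work` */
        if (result != NULL)
            memcpy(result, last, n);
    }
    free(work);
    return result;
}
```

The design choice here is that the token is copied out before the working buffer is released, which is exactly the step the original tokenizer skips.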

Upvotes: 3

dash1e

Reputation: 7807

You cannot do this:

char s[300];
char *buffer;
...
*name = strtok_r(s, "/", &buffer);
return buffer;

Here buffer points to a position inside s. s is a local array, allocated on the stack when the function is called and destroyed when the function returns. So you are not returning a valid pointer; you cannot use that pointer outside the function.
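
One way to sidestep the lifetime problem entirely is to never make a local copy: return a pointer into the caller's own string together with a length. The sketch below does this with strspn and strcspn instead of strtok_r, so it is a different technique from the one in the question. The name next_component and its len parameter are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the next '/'-separated component at or
   after `path`. Returns a pointer to its first character and stores its
   length in *len, or returns NULL when only '/' characters (or nothing)
   remain. Nothing is copied or modified, so the result points into the
   caller's string and lives exactly as long as that string does. */
const char *next_component(const char *path, size_t *len)
{
    assert(path != NULL && len != NULL);

    path += strspn(path, "/");     /* skip leading '/' characters */
    if (*path == '\0')
        return NULL;
    *len = strcspn(path, "/");     /* length up to the next '/' or the end */
    return path;
}
```

A caller would resume from p + n after each hit, for example for (p = next_component(s, &n); p; p = next_component(p + n, &n)) printf("%.*s\n", (int)n, p); where the components are not NUL-terminated and are printed using the length.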

Upvotes: 1

geekosaur

Reputation: 61369

You are returning a pointer into a stack-allocated string (buffer points into s, as does the token stored in *name); s's memory ceases to be meaningful after tokenizer returns.

Upvotes: 3
