Reputation: 1490
I have a file that I want to read line by line, splitting each line on tabs and storing the tokens in an array. But it turns out that tokens[0]..tokens[4] only hold the addresses that strtok() returns, so tokens[0]..tokens[4] change each time I call strtok on the next line of the file. How do I correct this? If I try char tokens[MAX_SIZE] instead of char* tokens[MAX_SIZE], a conversion error occurs because strtok returns char *.
The file is
20 34 90 10 77
80 12 37 29 63
45 21 55 18 46
My code is:
FILE *f;
if ((f = fopen("myinput.txt","r")) == NULL) {
    perror("Failed to open file");
    return -1;
}
char *line = NULL;   /* getline requires NULL (or a malloc'd buffer) here */
size_t len = 0;
char *tokens[MAX_SIZE];
int i = 0;
while (getline(&line, &len, f) != -1) {
    /* strip the trailing newline, then split on tabs */
    char *lineWithoutNullByte = strtok(line, "\n");
    tokens[i] = strtok(lineWithoutNullByte, "\t");
    i++;
    int x = 1;
    while (x) {
        tokens[i] = strtok(NULL, "\t");
        if (tokens[i] == NULL) {
            x = 0;
        } else {
            i++;
        }
    }
    printf("test: %s %s %s %s %s\n", tokens[0], tokens[1], tokens[2], tokens[3], tokens[4]);
}
The expected output is
test: 20 34 90 10 77
test: 20 34 90 10 77
test: 20 34 90 10 77
But I am getting:
test: 20 34 90 10 77
test: 80 12 37 29 63
test: 45 21 55 18 46
To clarify:
This means that if I print the entire tokens array, I will be getting
45 21 55 18 46
45 21 55 18 46
45 21 55 18 46
Upvotes: 4
Views: 3221
Reputation: 726609
You are not using tokens that you get from strtok correctly: the tokens that you get come from the buffer returned by getline. The first call gives you a new buffer; subsequent calls write into the same buffer, because the line fits into the allocated space.
Since you store pointers into that buffer, the next time a line with new data is placed into the old space, all tokens pointing to that address will "see" the new data. To avoid this problem, you need to copy the tokens right after taking them from strtok, for example, by passing them to strdup:
char *tmp = strtok(NULL, "\t");
if (tmp == NULL) {
    x = 0;
    tokens[i] = NULL;
} else {
    /* store the copy first, then advance, so no slot is skipped */
    tokens[i] = strdup(tmp);
    i++;
}
You would need to strdup the first token as well.
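To put the copying in one place, here is a minimal sketch that treats the first token and the later ones the same way. The helper name copy_tokens and its max parameter are my own invention for illustration, not something from your code, and it assumes a POSIX system where strdup is available.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: POSIX, needed for strdup */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `line` on tabs (and a trailing newline),
 * modifying it in place as strtok does, and stores a heap copy of each
 * token in out[]. Returns the number of tokens stored, at most `max`. */
size_t copy_tokens(char *line, char **out, size_t max)
{
    size_t n = 0;

    for (char *tok = strtok(line, "\t\n");
         tok != NULL && n < max;
         tok = strtok(NULL, "\t\n")) {
        char *copy = strdup(tok);   /* the copy no longer aliases `line` */
        if (copy == NULL)
            break;                  /* out of memory: stop early */
        out[n++] = copy;
    }
    return n;
}
```

Because every stored pointer refers to its own allocation, overwriting the line buffer afterwards should leave the saved tokens unchanged.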
Notes: if you take this approach, you would need to free the individual tokens once your program is done with them. You also need to free the buffer returned by getline once the outer while loop has finished:
free(line);
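As a rough sketch of where those frees would go, assuming a POSIX system: the names read_all_tokens and free_tokens are hypothetical helpers I made up for this example, and the loop is simplified compared to the one in the question.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: POSIX, for getline and strdup */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: reads every line of f and stores a heap copy of
 * each tab-separated token in tokens[]. Returns the number stored. */
size_t read_all_tokens(FILE *f, char **tokens, size_t max)
{
    char *line = NULL;   /* getline allocates the buffer on the first call */
    size_t len = 0;
    size_t n = 0;

    while (n < max && getline(&line, &len, f) != -1) {
        for (char *tok = strtok(line, "\t\n");
             tok != NULL && n < max;
             tok = strtok(NULL, "\t\n")) {
            char *copy = strdup(tok);
            if (copy == NULL) {
                free(line);
                return n;
            }
            tokens[n++] = copy;
        }
    }
    free(line);          /* a single free after the loop: getline reuses the buffer */
    return n;
}

/* Hypothetical helper: releases the copies made above. */
void free_tokens(char **tokens, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        free(tokens[i]);
        tokens[i] = NULL;
    }
}
```

The caller would invoke free_tokens(tokens, n) once it no longer needs the strings; the line buffer is already released inside read_all_tokens.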
In addition, strtok is non-reentrant: it keeps its position in hidden static state, meaning that it cannot be used in concurrent environments, or even to tokenize strings in nested loops. You should use strtok_r instead.
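To illustrate the nested-loop case, here is a small sketch with strtok_r, where each loop carries its own state pointer. The function count_fields and its parameters are hypothetical, chosen only for this example, and strtok_r itself is a POSIX function rather than standard C.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: POSIX, for strtok_r */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `text` into lines and each line into
 * tab-separated fields, recording the number of fields per line in
 * counts[]. Modifies `text` in place and returns the number of lines. */
size_t count_fields(char *text, size_t *counts, size_t max_lines)
{
    char *line_state = NULL;     /* state of the outer tokenizer */
    size_t lines = 0;

    for (char *line = strtok_r(text, "\n", &line_state);
         line != NULL && lines < max_lines;
         line = strtok_r(NULL, "\n", &line_state)) {
        char *field_state = NULL;    /* separate state for the inner tokenizer */
        size_t fields = 0;

        for (char *field = strtok_r(line, "\t", &field_state);
             field != NULL;
             field = strtok_r(NULL, "\t", &field_state)) {
            fields++;
        }
        counts[lines++] = fields;
    }
    return lines;
}
```

With plain strtok the inner loop would overwrite the outer loop's saved position, so the outer loop would stop after the first line.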
Upvotes: 3
Reputation: 75
You should use strtok_r instead of strtok, because strtok is only effective the first time. I don't know the reason, but I ran into this problem once.
Upvotes: 0