Reputation: 163
I have the following string:
GET /index.html HTTP/1.0;;User-Agent: Wget/1.11.4;;Accept: */*;;Host: www.google.com;;Connection
I use the following code to parse each element:
while (parser != NULL) {
    printf("%s\n", parser);
    parser = strtok(NULL, ";;");
}
This outputs:
GET /index.html HTTP/1.0
User-Agent: Wget/1.11.4
Accept: */*
Host: www.google.com
Connection
Now I only need the host's web address, which in this case is www.google.com, so first I want to separate it from the other fields.
To do that, I put another parser inside my previous one, like so:
while (parser != NULL) {
    char *pars = strtok(string, ":");
    while (pars != NULL) {
        printf("%s\n", pars);
        pars = strtok(NULL, ":");
    }
    parser = strtok(NULL, ";;");
}
The output of this is garbled, and I do not understand why. Can anyone see the mistake? Thanks.
Upvotes: 1
Views: 89
Reputation: 46365
There is a big problem with your approach, apart from the issue of strtok not being re-entrant: strtok treats its second argument as a set of delimiter characters and matches any one of them, so strtok(NULL, ";;") will stop at the first ; it finds, not only at a ;; pair.
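To see the delimiter-set behaviour concretely, here is a minimal sketch. The helper name count_tokens is made up for this illustration; it only counts how many tokens strtok produces for a given delimiter string.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the tokens strtok yields for `delims`.
   `buf` must be writable, because strtok modifies it in place. */
int count_tokens(char *buf, const char *delims)
{
    int count = 0;

    for (char *tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;

    return count;
}
```

Calling it on two writable copies of "a;b;;c", once with ";;" and once with ";", gives the same count both times, because a single ; is already enough to end a token.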
I would go about this a different way: you are looking for a specific string (";;Host:" in the string you posted), so search for that, then extract the part that follows. This seems like a more robust solution.
Also note that strtok modifies its argument: it overwrites the delimiter that ends each token with '\0', so you will not be able to re-use the original string after strtok has processed it. If you want to use the string afterwards, you need to make a copy first.
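A minimal sketch of the copy-first idea, assuming the input may be a string literal or otherwise must stay intact. Both function names (duplicate and first_token_length) are invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s (caller frees), or NULL. */
char *duplicate(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Tokenizes a scratch copy so that `original` is never written to.
   Returns the length of the first token, or 0 if there is none. */
size_t first_token_length(const char *original, const char *delims)
{
    char *scratch = duplicate(original);
    size_t len = 0;

    if (scratch != NULL) {
        char *tok = strtok(scratch, delims);  /* writes '\0' into scratch only */
        if (tok != NULL)
            len = strlen(tok);
        free(scratch);
    }
    return len;
}
```

Because strtok only ever sees the scratch copy, the caller's string is unchanged afterwards and can still be searched or printed in full.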
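All of which suggests that you want to re-think your parsing strategy. How about:
#include <stdlib.h>   /* malloc */
#include <string.h>   /* strstr, memcpy */

const char *inputString = "GET /index.html HTTP/1.0;;User-Agent: Wget/1.11.4;;Accept: */*;;Host: www.google.com;;Connection";
const char *temp, *endHost;
char *hostString;
size_t nChar;

// strstr returns NULL when the text is absent; check for that in real code
temp = strstr(inputString, ";;Host: ") + 8;  // point right after "Host: "
endHost = strstr(temp, ";;");                // start of the next field
nChar = (size_t)(endHost - temp) + 1;        // +1 for the terminating '\0'
hostString = malloc(nChar);
memcpy(hostString, temp, nChar - 1);         // strcpy takes only two arguments
hostString[nChar - 1] = '\0';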
This is just to find / extract the host string.
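If you want the same idea wrapped up with error handling, a sketch could look like the following. The function name extract_host is hypothetical, and it assumes fields are separated by ";;" and that Host is never the first field.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of the Host value (caller frees),
   or NULL if the field is missing or the allocation fails. */
char *extract_host(const char *request)
{
    static const char key[] = ";;Host: ";
    const char *start = strstr(request, key);

    if (start == NULL)
        return NULL;
    start += sizeof key - 1;                 /* skip past ";;Host: " */

    const char *end = strstr(start, ";;");   /* next separator, if any */
    size_t len = (end != NULL) ? (size_t)(end - start) : strlen(start);

    char *host = malloc(len + 1);
    if (host == NULL)
        return NULL;
    memcpy(host, start, len);
    host[len] = '\0';
    return host;
}
```

Unlike the strtok approaches, this never writes to the input, so it also works when the request is a string literal. It also copes with Host being the last field, where no trailing ";;" follows.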
Upvotes: 0
Reputation: 726539
The reason your code does not work is that strtok is non-reentrant. Because the function uses static variables to save the state (this is what lets you call strtok with NULL as the first parameter), you cannot set up calls of strtok in nested loops: once you tell strtok to parse with the ":" delimiter, it "forgets" the state of parsing with the ";" delimiter.
Switching to the re-entrant version of strtok, strtok_r, will fix this problem. This function requires you to supply an extra parameter, savePtr. Important: you need to supply two different savePtr variables to strtok_r in the inner and the outer loops; otherwise the code would exhibit the same behavior.
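Here is a rough sketch of the nested loops with two separate state variables, assuming a POSIX system where strtok_r is available. The function name find_host and the variable names are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L  /* expose strtok_r on POSIX systems */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `request` on ";" in the outer loop and each
   field on ":" in the inner loop. Returns a pointer into `request` to the
   Host value, or NULL if there is none. `request` is modified in place. */
char *find_host(char *request)
{
    char *outer_state = NULL;   /* state for the ";" loop */
    char *host = NULL;

    for (char *field = strtok_r(request, ";", &outer_state);
         field != NULL;
         field = strtok_r(NULL, ";", &outer_state)) {

        char *inner_state = NULL;   /* separate state for the ":" loop */
        int index = 0;
        int is_host = 0;

        for (char *part = strtok_r(field, ":", &inner_state);
             part != NULL;
             part = strtok_r(NULL, ":", &inner_state), index++) {
            if (index == 0)
                is_host = (strcmp(part, "Host") == 0);
            else if (index == 1 && is_host)
                host = part + strspn(part, " ");  /* skip the space after ':' */
        }
    }
    return host;
}
```

Passed a writable copy of the string from the question, this returns a pointer to "www.google.com". One caveat of splitting on ":" is that a value which itself contains a colon gets cut at that colon.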
Note: strtok_r is not part of the C standard; it is specified by POSIX, and most popular C libraries make it available. In case your library does not have strtok_r, locate source code for it and add it to your own code base.
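For reference, a fallback can be quite small. The following is only a sketch built on strspn and strcspn, not the source of any particular library; it is named my_strtok_r here (a made-up name) so that it cannot clash with a real strtok_r.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for strtok_r: same calling convention, with the
   parsing position kept in *saveptr instead of a static variable. */
char *my_strtok_r(char *str, const char *delims, char **saveptr)
{
    if (str == NULL)
        str = *saveptr;              /* continue where the last call stopped */
    if (str == NULL)
        return NULL;                 /* input already exhausted */

    str += strspn(str, delims);      /* skip leading delimiters */
    if (*str == '\0') {
        *saveptr = NULL;
        return NULL;
    }

    char *end = str + strcspn(str, delims);  /* one past the end of the token */
    if (*end != '\0') {
        *end = '\0';                 /* terminate the token in place */
        *saveptr = end + 1;
    } else {
        *saveptr = end;              /* reached the end of the input */
    }
    return str;
}
```

As with the real function, the first call passes the string and later calls pass NULL, always with the address of the same char * variable for that loop.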
Upvotes: 4