James Read

Reputation: 459

How to detect empty line with strtok

I'm trying to split a header from a footer; the two are separated by an empty line, i.e. by the string "\r\n\r\n".

I have tried using strtok with the separator "\r\n\r\n", as the following code snippet shows.

#include <stdio.h>
#include <string.h>

int main() {
    char get[1024] = "HTTP/1.1 200 OK\r\nDate: Tue, 23 Jul 2019 22:52:44 GMT\r\n"
                     "Server: Apache/2.4.7 (Ubuntu)\r\n"
                     "Last-Modified: Wed, 18 Aug 2004 23:07:08 GMT\r\n"
                     "ETag: \"164-3e1f5b9a57f00\"\r\n"
                     "Accept-Ranges: bytes\r\n"
                     "Content-Length: 356\r\n"
                     "Vary: Accept-Encoding\r\n"
                     "Content-Type: text/html\r\n"
                     "\r\n"
                     "<html><head><title>Temporary Page</title></head><body><center><h1>New Account Temporary Page</h1>Welcome to yet another new account at Hurricane Electric.<p><h2>Come Back Soon!</h2>The New Account Owner Will Be Putting Something Interesting Here!</center><hr>Space Provided By <a href=\"http://www.he.net\">Hurricane Electric</a></body></html>";

    char *html;

    html = strtok(get, "\r\n\r\n");
    printf("Header:\n\n%s\n\n", html);
    html = strtok(NULL, "\r\n\r\n");
    printf("HTML:\n\n%s\n\n", html);

    return 0;
}

I expect the output to be the header printed out and then the HTML footer.

But the output is:

Header:

HTTP/1.1 200 OK

HTML:

Date: Tue, 23 Jul 2019 22:52:44 GMT

It seems that strtok is separating using "\r\n" rather than "\r\n\r\n". How can I fix this strange behaviour?

Upvotes: 2

Views: 1077

Answers (1)

Jean-François Fabre

Reputation: 140256

The delimiter argument of strtok is not a sequence of characters to be matched as a whole. It is the set of characters that can each delimit a token.

So repeating characters in the delimiter string has no effect: "\r\n\r\n" behaves exactly like "\r\n", and any run of consecutive delimiter characters is treated as a single separator.

To convince yourself, replace the delimiter with "GMT": your header becomes H, because strtok splits at the first T, which is in the delimiter set.
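The behaviour described above can be checked with a small sketch. The helper name print_tokens is hypothetical (it is not part of the question or of the C library); it only loops over strtok and reports what comes back.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints every token strtok yields for `text` and
   returns how many there were. strtok overwrites delimiter characters in
   `text` with '\0', so `text` must be a writable array. */
int print_tokens(char *text, const char *delims)
{
    int count = 0;

    for (char *tok = strtok(text, delims); tok != NULL; tok = strtok(NULL, delims)) {
        printf("token %d: [%s]\n", count, tok);
        count++;
    }
    return count;
}
```

Called on a writable copy of "A\r\nB\r\n\r\nC", this should report the same three tokens (A, B, C) whether the delimiter string is "\r\n" or "\r\n\r\n", since both describe the same set of two characters.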

An alternative (quick and dirty) is to locate the delimiter with strstr, write a zero byte there, and print both strings:

#include <stdio.h>
#include <string.h>

int main ()
{

    char get[1024] = "HTTP/1.1 200 OK\r\nDate: Tue, 23 Jul 2019 22:52:44 GMT\r\nServer: Apache/2.4.7 (Ubuntu)\r\nLast-Modified: Wed, 18 Aug 2004 23:07:08 GMT\r\nETag: \"164-3e1f5b9a57f00\"\r\nAccept-Ranges: bytes\r\nContent-Length: 356\r\nVary: Accept-Encoding\r\nContent-Type: text/html\r\n\r\n<html><head><title>Temporary Page</title></head><body><center><h1>New Account Temporary Page</h1>Welcome to yet another new account at Hurricane Electric.<p><h2>Come Back Soon!</h2>The New Account Owner Will Be Putting Something Interesting Here!</center><hr>Space Provided By <a href=\"http://www.he.net\">Hurricane Electric</a></body></html>";

    const char *delim = "\r\n\r\n";

    char *html;

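    html = strstr(get, delim);
    if (html == NULL) {
        fprintf(stderr, "No empty line found\n");
        return 1;
    }
    html[0] = '\0';  // terminate the header where the empty line starts
    printf("Header:\n\n%s\n\n", get);
    printf("HTML:\n\n%s\n\n", html + strlen(delim));  // skip past the delimiter

    return 0;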

}

output:

Header:

HTTP/1.1 200 OK
Date: Tue, 23 Jul 2019 22:52:44 GMT
Server: Apache/2.4.7 (Ubuntu)
Last-Modified: Wed, 18 Aug 2004 23:07:08 GMT
ETag: "164-3e1f5b9a57f00"
Accept-Ranges: bytes
Content-Length: 356
Vary: Accept-Encoding
Content-Type: text/html

HTML:

<html><head><title>Temporary Page</title></head><body><center><h1>New Account Temporary Page</h1>Welcome to yet another new account at Hurricane Electric.<p><h2>Come Back Soon!</h2>The New Account Owner Will Be Putting Something Interesting Here!</center><hr>Space Provided By <a href="http://www.he.net">Hurricane Electric</a></body></html>
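The strstr approach writes a '\0' into the buffer. If the buffer has to stay intact (for instance because it is const), a variant could return the header length instead. This is only a sketch: find_body and header_len are names made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the first empty line ("\r\n\r\n") in `response`.
   On success it stores the header length in *header_len and returns a
   pointer to the first byte after the empty line. It returns NULL, and
   leaves *header_len untouched, if no empty line is present.
   The buffer itself is never modified. */
const char *find_body(const char *response, size_t *header_len)
{
    static const char delim[] = "\r\n\r\n";
    const char *sep = strstr(response, delim);

    if (sep == NULL) {
        return NULL;
    }
    *header_len = (size_t)(sep - response);
    return sep + strlen(delim);
}
```

The header can then be printed without terminating it, using a precision in the format string: printf("%.*s", (int)header_len, response).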

Upvotes: 4
