Harry Boy

Reputation: 4745

Parse HTTP headers in C++

I am using curl to communicate with a server.

When I make a request for data, I receive the HTTP headers followed by JPEG data, with the parts separated by a boundary string.

I need to parse out

  1. The boundary string
  2. The Content-Length.

I have copied the incoming data to a char array like so:

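static size_t OnReceiveData ( void * pvData, size_t tSize, size_t tCount, void * pvUser )
{
    // libcurl passes tSize * tCount bytes; the buffer is not null-terminated.
    const size_t tLength = tSize * tCount;

    char* _data = nullptr;
    if(pvData != nullptr && 0 != tLength)
    {
        // The precision of %.*s must be an int, and the argument a char pointer.
        printf("%.*s", static_cast<int>(tLength), static_cast<const char*>(pvData));

        _data = new char[tLength];
        memcpy(_data, pvData, tLength);
    }

    // ... inspect and parse _data here ...

    delete[] _data;

    // Return the number of bytes handled; a different value signals an error to libcurl.
    return ( tLength );
}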

How can I best do this in C++? How do I actually inspect and parse the _data array for the information that I want? Are there any Boost libraries that I can use, for example?

Upvotes: 5

Views: 22036

Answers (3)

jvandenbroek

Reputation: 39

I would put all headers in a map, after which you can easily iterate through it. No Boost needed. Here is a basic working example with libcurl:

#include <iostream>
#include <string>
#include <map>
#include <curl/curl.h>

static size_t OnReceiveData (void * pData, size_t tSize, size_t tCount, void * pmUser)
{
    size_t length = tSize * tCount, index = 0;
    while (index < length)
    {
        unsigned char *temp = (unsigned char *)pData + index;
        if ((temp[0] == '\r') || (temp[0] == '\n'))
            break;
        index++;
    }

    std::string str((unsigned char*)pData, (unsigned char*)pData + index);
    std::map<std::string, std::string>* pmHeader = (std::map<std::string, std::string>*)pmUser;
    size_t pos = str.find(": ");
    if (pos != std::string::npos)
        pmHeader->insert(std::pair<std::string, std::string> (str.substr(0, pos), str.substr(pos + 2)));

    // Report every byte received (tSize * tCount) as handled.
    return (length);
}

int main(int argc, char* argv[])
{
    CURL *curl = curl_easy_init();
    if (!curl)
        return 1;

    std::map<std::string, std::string> mHeader;

    curl_easy_setopt(curl, CURLOPT_URL, "http://www.example.com");
    curl_easy_setopt(curl, CURLOPT_HEADERFUNCTION, OnReceiveData);
    curl_easy_setopt(curl, CURLOPT_HEADERDATA, &mHeader);
    curl_easy_setopt(curl, CURLOPT_NOBODY, 1L); // this option takes a long
    curl_easy_perform(curl);
    curl_easy_cleanup(curl);

    std::map<std::string, std::string>::const_iterator itt;
    for (itt = mHeader.begin(); itt != mHeader.end(); itt++)
    {
        if (itt->first == "Content-Type" || itt->first == "Content-Length")
            std::cout << itt->first << ": " << itt->second << std::endl;
    }
}
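
One caveat with exact-match lookups such as `itt->first == "Content-Length"`: HTTP header field names are case-insensitive, so a server may send `content-length` instead. A minimal sketch of one way to handle this is to give the map a case-insensitive ordering. The names `CaseInsensitiveLess`, `HeaderMap` and `GetHeader` are hypothetical and are not part of libcurl:

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical comparator: orders strings while ignoring ASCII case,
// so "Content-Length" and "content-length" refer to the same key.
struct CaseInsensitiveLess
{
    bool operator()(const std::string& a, const std::string& b) const
    {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

// Hypothetical alias for a header map with case-insensitive keys.
using HeaderMap = std::map<std::string, std::string, CaseInsensitiveLess>;

// Returns the value stored for `name`, or an empty string if it is absent.
std::string GetHeader(const HeaderMap& headers, const std::string& name)
{
    HeaderMap::const_iterator it = headers.find(name);
    return it != headers.end() ? it->second : std::string();
}
```

With this, `mHeader` could be declared as a `HeaderMap` (with the cast in the callback changed to match), after which `GetHeader(mHeader, "content-length")` would also find a header that the server sent as `Content-Length`.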

Upvotes: 3

Marshall Clow

Reputation: 16670

The cpp-netlib project (based on Boost) contains a full MIME parser (written with Boost.Spirit).

I'm not really that happy with the interface of the parser, but it works well.

Upvotes: 1

Grigorii Chudnov

Reputation: 3112

You could parse the headers on the fly, or put them into a map and post-process them later. Use the find and substr methods of std::string. Also look at the Boost String Algorithms Library, which contains many useful algorithms such as trim.

For example, to place the headers into a std::map and print them (a rough sketch):

#include <stdlib.h>
#include <iostream>
#include <sstream>
#include <string>
#include <map>
#include <boost/algorithm/string.hpp>

int main(int argc, char* argv[]) {
  const char* s = "HTTP/1.1 200 OK\r\n"
    "Content-Type: image/jpeg; charset=utf-8\r\n"
    "Content-Length: 19912\r\n\r\n";

  std::map<std::string, std::string> m;

  std::istringstream resp(s);
  std::string header;
  std::string::size_type index;
  while (std::getline(resp, header) && header != "\r") {
    index = header.find(':', 0);
    if(index != std::string::npos) {
      m.insert(std::make_pair(
        boost::algorithm::trim_copy(header.substr(0, index)), 
        boost::algorithm::trim_copy(header.substr(index + 1))
      ));
    }
  }

  for(auto& kv: m) {
    std::cout << "KEY: `" << kv.first << "`, VALUE: `" << kv.second << '`' << std::endl;
  }

  return EXIT_SUCCESS;
}

You will get the output:

KEY: `Content-Length`, VALUE: `19912`
KEY: `Content-Type`, VALUE: `image/jpeg; charset=utf-8`

Having the headers, you could extract the required ones for post-processing.
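
For the two values asked about in the question, a minimal sketch of that post-processing step might look like the following. The helper names `ParseContentLength` and `ExtractBoundary` are hypothetical. The sketch also assumes that the boundary arrives as a `boundary=` parameter of the `Content-Type` header (as in `multipart/x-mixed-replace; boundary=frame`), spelled in exactly that lowercase form, which the question does not confirm:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: converts a Content-Length value to a number.
// Returns false if the value is empty or contains a non-digit character.
// Overflow is not checked in this sketch.
bool ParseContentLength(const std::string& value, std::size_t& length)
{
    if (value.empty())
        return false;

    std::size_t result = 0;
    for (char c : value)
    {
        if (c < '0' || c > '9')
            return false;
        result = result * 10 + static_cast<std::size_t>(c - '0');
    }
    length = result;
    return true;
}

// Hypothetical helper: pulls the boundary parameter out of a Content-Type
// value. Returns an empty string if there is no boundary parameter.
std::string ExtractBoundary(const std::string& contentType)
{
    const std::string key = "boundary=";
    std::string::size_type pos = contentType.find(key);
    if (pos == std::string::npos)
        return std::string();
    pos += key.size();

    // The parameter ends at the next ';' or at the end of the value.
    const std::string::size_type end = contentType.find(';', pos);
    std::string boundary = contentType.substr(
        pos, end == std::string::npos ? std::string::npos : end - pos);

    // Strip optional surrounding double quotes.
    if (boundary.size() >= 2 && boundary.front() == '"' && boundary.back() == '"')
        boundary = boundary.substr(1, boundary.size() - 2);

    return boundary;
}
```

With the map `m` from the example above, `ParseContentLength(m["Content-Length"], n)` would set `n` to 19912, while `ExtractBoundary(m["Content-Type"])` would return an empty string because that sample response has no boundary parameter.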

Upvotes: 5
