Sean Lynch

Reputation: 2983

Boost.asio socket - Why no bytes available

I'm new to using boost.asio but I wanted to design a simple routine to get data from a URL and save it to a memory buffer. Based on an example I found, I came up with the following:

//data_url.hpp

#include <boost/asio.hpp>
#include <string>
#include <vector>

struct data {
  data();
  boost::asio::io_service io_service;
  boost::asio::ip::tcp::resolver resolver;
  boost::asio::ip::tcp::socket socket;
  std::vector<char> buffer;
  std::string host;
  std::string path;
  std::string port;
};

void setup_url(std::string hst, 
               std::string pth, 
               std::string prt = "80");

std::vector<char> & get_data_from_url();

void resolve_handler(const boost::system::error_code & ec,
                     boost::asio::ip::tcp::resolver::iterator it);

void connect_handler(const boost::system::error_code & ec);

void read_handler(const boost::system::error_code & ec,
                  std::size_t bytes_transferred);

Implementation

//data_url.cpp

#include <data_url.hpp>
#include <iostream>

data::data() : io_service(), resolver(io_service), socket(io_service)
             , buffer(), host(), path(), port() {}
data d;

void setup_url(std::string hst, std::string pth, std::string prt) {
  d.host = hst;
  d.path = pth;
  d.port = prt;
}

std::vector<char> & get_data_from_url() {
  boost::asio::ip::tcp::resolver::query query(d.host, d.port);
  d.resolver.async_resolve(query, resolve_handler);
  d.io_service.run();
  return d.buffer;
}

void resolve_handler(const boost::system::error_code & ec,
                     boost::asio::ip::tcp::resolver::iterator it) {
  if( !ec ) {
    d.socket.async_connect(*it, connect_handler);
  }
}

void connect_handler(const boost::system::error_code & ec) {
  if( !ec ) {
    boost::asio::write(d.socket,
                       boost::asio::buffer(std::string("GET ") + 
                                           std::string(d.path) +
                                           std::string(" HTTP/1.1\r\n") + 
                                           std::string("Host: ") +
                                           std::string(d.host) +
                                           std::string("\r\n\r\n")));
    boost::system::error_code ec_avail;
    d.buffer.resize(d.socket.available(ec_avail));
    d.socket.async_read_some(boost::asio::buffer(d.buffer), 
                             read_handler);
  }
}

void read_handler(const boost::system::error_code & ec,
                  std::size_t bytes_transferred) {
  if( !ec ) {
    d.socket.async_read_some(boost::asio::buffer(d.buffer), 
                             read_handler);
  }
}

I then run it with

#include <data_url.hpp>

int main(int argc, char *argv[]) {
  setup_url("www.boost.org", "/");
  std::vector<char> data;
  data = get_data_from_url();

  return 0;
}

The code calls read_handler endlessly and never terminates. I've tried it with different pages and that doesn't make a difference.

Also, in the connect_handler function I resize the vector using socket.available(). This is to make the code as general as possible so that I can read any page without having to know its size. But when I call socket.available() it returns zero and sets the error code in ec_avail to "Undefined error".

As I mentioned, I'm new to using boost.asio and obviously I'm missing something here. I'd appreciate help fixing these errors and any other advice/suggestions.

Upvotes: 3

Views: 1650

Answers (1)

Gerald

Reputation: 23499

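Your socket is unlikely to have data available immediately after calling write, so socket.available() will most likely return 0 and your resizing method is not going to work. You are then calling async_read_some with a buffer of size 0. A read into a zero-size buffer completes immediately with zero bytes transferred and no error, so read_handler keeps rescheduling itself and never terminates.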

The whole point of async_read_some is that it doesn't need to fill the entire buffer before the async operation completes. Use a temporary buffer with a fixed block size of something like 8192 bytes, and after each successful call to your read handler, append the bytes that were read (bytes_transferred of them) to the end of your primary buffer.

ASIO isn't an HTTP library, so if you're using it to download files via HTTP you'll need to do a lot of processing of the data on the fly to parse the header and determine the type of response it is, determine the content length and encoding, etc.
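To give an idea of the kind of processing involved, here is a rough sketch that finds the end of the response header, extracts the status code, and looks for a Content-Length field. `http_head` and `parse_head` are hypothetical names made up for this illustration. It deliberately ignores chunked transfer encoding, folded header lines, and most error handling, so treat it as a starting point rather than a complete parser.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical result type, for illustration only.
struct http_head {
  bool complete = false;        // true once the blank line ending the header was seen
  int status = 0;               // status code from the status line, e.g. 200
  long content_length = -1;     // -1 if no Content-Length field was found
  std::size_t body_offset = 0;  // index of the first body byte in the input
};

// Lower-case copy, used to compare header names case-insensitively.
static std::string to_lower(std::string s) {
  for (char & c : s)
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return s;
}

// Parse the header portion of the data received so far.
http_head parse_head(const std::string & data) {
  http_head h;
  const std::size_t end = data.find("\r\n\r\n");
  if (end == std::string::npos) return h;  // header not fully received yet
  h.complete = true;
  h.body_offset = end + 4;

  // Status line, e.g. "HTTP/1.1 200 OK": the code follows the first space.
  const std::size_t line_end = data.find("\r\n");
  const std::string status_line = data.substr(0, line_end);
  const std::size_t sp = status_line.find(' ');
  if (sp != std::string::npos)
    h.status = std::atoi(status_line.c_str() + sp + 1);

  // Remaining header lines, each of the form "Name: value".
  std::size_t pos = line_end + 2;
  while (pos < end) {
    const std::size_t next = data.find("\r\n", pos);
    const std::string line = data.substr(pos, next - pos);
    const std::size_t colon = line.find(':');
    if (colon != std::string::npos &&
        to_lower(line.substr(0, colon)) == "content-length")
      h.content_length = std::atol(line.c_str() + colon + 1);
    pos = next + 2;
  }
  return h;
}
```

You would call this on the accumulated data after each read until `complete` is true. From then on, everything at and after `body_offset` is body, and `content_length` (when present) tells you how many body bytes to expect.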

If you just want to use it for HTTP stuff, I would highly recommend using Poco instead, as it has built-in support for HTTP streams.

Upvotes: 3
