jekelija

Reputation: 277

Sending raw data over boost::asio

I am attempting to send raw data over boost::asio, as boost::serialization is too slow for my needs. Following various examples and the Boost documentation, I have a client:

SimulationClient:

 void SimulationClient::sendData(std::vector<WaveformDefinition>waveformPackets) {
        socket.async_send_to(boost::asio::buffer(waveformPackets),
                receiver_endpoint,
                boost::bind(&ClientEnvironmentEngine::sendComplete, this,
                        boost::asio::placeholders::error,
                        boost::asio::placeholders::bytes_transferred));
}

I attempted Tanner Sansbury's solution below, but was unable to get it to work. However, I am having success using:

class WaveformReceiver {
     WaveformDefinition *buffer;

     WaveformReceiver(){
         buffer = new WaveformDefinition[MAX_WAVEFORMS];
         startReceive();
     }

     void startReceive() {
         socket_.async_receive_from(boost::asio::null_buffers(), remote_endpoint_,
               boost::bind(&WaveformReceiver::handleReceive, this,
               boost::asio::placeholders::error,
               boost::asio::placeholders::bytes_transferred));
     }

     void handleReceive(const boost::system::error_code& error,
        std::size_t size/*bytes_transferred*/)
     {
          if (!error)
          {
               int available = socket_.available();
               int numWaveforms = available / sizeof(WaveformDefinition_c);
               socket_.receive(boost::asio::buffer(buffer, available));

               //copy buffer into another buffer so we can re-use the original buffer for the next read
               WaveformDefinition_c* tempBuffer = new WaveformDefinition_c[numWaveforms];
               std::memcpy ( tempBuffer, buffer, available );

               //schedule a thread to handle the array of waveforms that we copied
               threadPool.schedule( boost::bind( handleWaveforms, tempBuffer, numWaveforms));
               //start listening for more waveforms
               startReceive();
          }
     }
}

Tanner, or others, can you tell me whether what I am doing should also work, or whether I am just getting lucky that it currently works?

Upvotes: 2

Views: 5558

Answers (2)

Tanner Sansbury

Reputation: 51891

The fundamental part of the question is about serializing and deserializing collections.

Without controlling the compiler and architectures of both the server and client, sending raw structures is typically unsafe as the byte representation may differ between systems. While the compiler and architecture are the same in this specific case, the #pragma pack(1) is irrelevant as WAVEFORM_DATA_STRUCT is not being written as raw memory to the socket. Instead, multiple buffers of memory are provided for a gathering write operation.

boost::array<boost::asio::mutable_buffer,2> buffer = {{
  boost::asio::buffer(&waveformPacket->numWaveforms, ...), // &numWaveforms
  boost::asio::buffer(waveformPacket->waveforms)           // &waveforms[0]
}};

There are various tools to help with serializing data structures, such as Protocol Buffers.


The code below will demonstrate the basics for serializing a data structure for network communication. To simplify the code and explanation, I have chosen to focus on serialization and deserialization, rather than writing and reading from the socket. Another example located below this section will show more of a raw approach, that assumes the same compiler and architecture.

Starting with a basic foo type:

struct foo
{
  char a;
  char b;
  boost::uint16_t c;
};

It can be determined that the data can be packed into 4 total bytes. Below is one possible wire representation:

0        8       16       24       32
|--------+--------+--------+--------|
|   a    |   b    |        c        |
'--------+--------+--------+--------'

With the wire representation determined, two functions can be used: one to serialize (save) a foo object to a buffer, and another to deserialize (load) a foo from a buffer. As foo.c is larger than a byte, the functions also need to account for endianness. I opted to use the endian byte-swapping functions in the Boost.Asio detail namespace for some platform neutrality.

/// @brief Serialize foo into a network-byte-order buffer.
void serialize(const foo& foo, unsigned char* buffer)
{
  buffer[0] = foo.a;
  buffer[1] = foo.b;

  // Handle endianness.
  using ::boost::asio::detail::socket_ops::host_to_network_short;
  boost::uint16_t c = host_to_network_short(foo.c);
  std::memcpy(&buffer[2], &c, sizeof c);
}

/// @brief Deserialize foo from a network-byte-order buffer.
void deserialize(foo& foo, const unsigned char* buffer)
{
  foo.a = buffer[0];
  foo.b = buffer[1];

  // Handle endianness.
  using ::boost::asio::detail::socket_ops::network_to_host_short;
  boost::uint16_t c;
  std::memcpy(&c, &buffer[2], sizeof c);
  foo.c = network_to_host_short(c);
}
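As a hedged alternative, the same 4-byte layout can be produced with explicit byte shifts, which avoids depending on functions from a detail namespace and on the host's byte order. This is only a sketch: the names serialize_shift, deserialize_shift, and foo_wire_size are hypothetical, and it assumes C++11 with <cstdint> types in place of the Boost typedefs.

```cpp
#include <cstddef>
#include <cstdint>

// Same members as the answer's foo, using <cstdint> instead of Boost types.
struct foo
{
  char a;
  char b;
  std::uint16_t c;
};

// Wire size: a (1 byte) + b (1 byte) + c (2 bytes).
const std::size_t foo_wire_size = 4;

// Write foo into buffer with c in network byte order (big-endian).
// The caller must provide at least foo_wire_size bytes.
void serialize_shift(const foo& value, unsigned char* buffer)
{
  buffer[0] = static_cast<unsigned char>(value.a);
  buffer[1] = static_cast<unsigned char>(value.b);
  buffer[2] = static_cast<unsigned char>(value.c >> 8);   // high byte first
  buffer[3] = static_cast<unsigned char>(value.c & 0xFF); // low byte second
}

// Read foo back from a buffer written by serialize_shift().
void deserialize_shift(foo& value, const unsigned char* buffer)
{
  value.a = static_cast<char>(buffer[0]);
  value.b = static_cast<char>(buffer[1]);
  value.c = static_cast<std::uint16_t>((buffer[2] << 8) | buffer[3]);
}
```

Because the shifts operate on values rather than on memory, the bytes written are the same on little-endian and big-endian hosts, so no separate swap step is needed.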

With the serialization and deserialization done for foo, the next step is to handle a collection of foo objects. Before writing the code, the wire representation needs to be determined. In this case, I have decided to prefix a sequence of foo elements with a 32-bit count field.

0        8       16       24       32
|--------+--------+--------+--------|
|       count of foo elements [n]   |
|--------+--------+--------+--------|
|         serialized foo [0]        |
|--------+--------+--------+--------|
|         serialized foo [1]        |
|--------+--------+--------+--------|
|                ...                |
|--------+--------+--------+--------|
|         serialized foo [n-1]      |
'--------+--------+--------+--------'

Once again, two helper functions can be introduced to serialize and deserialize a collection of foo objects; they also need to account for the byte order of the count field.

/// @brief Serialize a collection of foos into a network-byte-order buffer.
template <typename Foos>
std::vector<unsigned char> serialize(const Foos& foos)
{
  boost::uint32_t count = foos.size();

  // Allocate a buffer large enough to store:
  //   - Count of foo elements.
  //   - Each serialized foo object.
  std::vector<unsigned char> buffer(
      sizeof count +            // count
      foo_packed_size * count); // serialize foo objects

  // Handle endianness for size.
  using ::boost::asio::detail::socket_ops::host_to_network_long;
  count = host_to_network_long(count);

  // Pack size into buffer.
  unsigned char* current = &buffer[0];
  std::memcpy(current, &count, sizeof count);
  current += sizeof count; // Adjust position.

  // Pack each foo into the buffer.
  BOOST_FOREACH(const foo& foo, foos)
  {
    serialize(foo, current);
    current += foo_packed_size; // Adjust position.
  }

  return buffer;
}

/// @brief Deserialize a buffer into a collection of foo objects.
std::vector<foo> deserialize(const std::vector<unsigned char>& buffer)
{
  const unsigned char* current = &buffer[0];

  // Extract the count of elements from the buffer.
  boost::uint32_t count;
  std::memcpy(&count, current, sizeof count);
  current += sizeof count;

  // Handle endianness.
  using ::boost::asio::detail::socket_ops::network_to_host_long;
  count = network_to_host_long(count);

  // With the count extracted, create the appropriate sized collection.
  std::vector<foo> foos(count);

  // Deserialize each foo from the buffer.
  BOOST_FOREACH(foo& foo, foos)
  {
    deserialize(foo, current);
    current += foo_packed_size;
  }

  return foos;
}
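One caveat with a count-prefixed format is that the count arrives from the network and cannot be trusted. The sketch below is an assumption about how a receiver might validate it before allocating or reading, not part of the original code: read_checked_count is a hypothetical helper, it assumes C++11, and it reuses the 4-byte packed size from above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Packed size of one foo on the wire, as in the answer (1 + 1 + 2).
const std::size_t foo_packed_size = 4;

// Reads the big-endian 32-bit count prefix and checks that the buffer
// really holds that many packed elements. Returns false on malformed
// input and leaves count untouched in that case.
bool read_checked_count(const std::vector<unsigned char>& buffer,
                        std::uint32_t& count)
{
  const std::size_t header_size = 4;
  if (buffer.size() < header_size) return false; // No room for the count.

  // Assemble the count byte by byte from network byte order.
  const std::uint32_t n =
      (static_cast<std::uint32_t>(buffer[0]) << 24) |
      (static_cast<std::uint32_t>(buffer[1]) << 16) |
      (static_cast<std::uint32_t>(buffer[2]) << 8)  |
       static_cast<std::uint32_t>(buffer[3]);

  // Compare against the payload actually present. Dividing the payload
  // avoids the overflow that n * foo_packed_size could hit for huge n.
  const std::size_t payload = buffer.size() - header_size;
  if (n > payload / foo_packed_size) return false;

  count = n;
  return true;
}
```

A deserializer could call this first and only construct the result vector when it returns true, so a corrupted or hostile count cannot trigger an oversized allocation or a read past the end of the buffer.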

Here is the complete example code:

#include <iostream>
#include <vector>
#include <boost/asio.hpp>
#include <boost/asio/detail/socket_ops.hpp> // endian functions
#include <boost/cstdint.hpp>
#include <boost/foreach.hpp>
#include <boost/tuple/tuple.hpp>            // boost::tie
#include <boost/tuple/tuple_comparison.hpp> // operator== for boost::tuple

/// @brief Mockup type.
struct foo
{
  char a;
  char b;
  boost::uint16_t c;
};

/// @brief Equality check for foo objects.
bool operator==(const foo& lhs, const foo& rhs)
{
  return boost::tie(lhs.a, lhs.b, lhs.c) ==
         boost::tie(rhs.a, rhs.b, rhs.c);
}

/// @brief Calculated byte packed size for foo.
///
/// @note char + char + uint16 = 1 + 1 + 2 = 4
static const std::size_t foo_packed_size = 4;

/// @brief Serialize foo into a network-byte-order buffer.
///
/// @detail Data is packed as follows:
///
///   0        8       16       24       32
///   |--------+--------+--------+--------|
///   |   a    |   b    |        c        |
///   '--------+--------+--------+--------'
void serialize(const foo& foo, unsigned char* buffer)
{
  buffer[0] = foo.a;
  buffer[1] = foo.b;

  // Handle endianness.
  using ::boost::asio::detail::socket_ops::host_to_network_short;
  boost::uint16_t c = host_to_network_short(foo.c);
  std::memcpy(&buffer[2], &c, sizeof c);
}

/// @brief Deserialize foo from a network-byte-order buffer.
void deserialize(foo& foo, const unsigned char* buffer)
{
  foo.a = buffer[0];
  foo.b = buffer[1];

  // Handle endianness.
  using ::boost::asio::detail::socket_ops::network_to_host_short;
  boost::uint16_t c;
  std::memcpy(&c, &buffer[2], sizeof c);
  foo.c = network_to_host_short(c);
}

/// @brief Serialize a collection of foos into a network-byte-order buffer.
///
/// @detail Data is packed as follows:
///
///   0        8       16       24       32
///   |--------+--------+--------+--------|
///   |       count of foo elements [n]   |
///   |--------+--------+--------+--------|
///   |         serialized foo [0]        |
///   |--------+--------+--------+--------|
///   |         serialized foo [1]        |
///   |--------+--------+--------+--------|
///   |                ...                |
///   |--------+--------+--------+--------|
///   |         serialized foo [n-1]      |
///   '--------+--------+--------+--------'
template <typename Foos>
std::vector<unsigned char> serialize(const Foos& foos)
{
  boost::uint32_t count = foos.size();

  // Allocate a buffer large enough to store:
  //   - Count of foo elements.
  //   - Each serialized foo object.
  std::vector<unsigned char> buffer(
      sizeof count +            // count
      foo_packed_size * count); // serialize foo objects

  // Handle endianness for size.
  using ::boost::asio::detail::socket_ops::host_to_network_long;
  count = host_to_network_long(count);

  // Pack size into buffer.
  unsigned char* current = &buffer[0];
  std::memcpy(current, &count, sizeof count);
  current += sizeof count; // Adjust position.

  // Pack each foo into the buffer.
  BOOST_FOREACH(const foo& foo, foos)
  {
    serialize(foo, current);
    current += foo_packed_size; // Adjust position.
  }

  return buffer;
}

/// @brief Deserialize a buffer into a collection of foo objects.
std::vector<foo> deserialize(const std::vector<unsigned char>& buffer)
{
  const unsigned char* current = &buffer[0];

  // Extract the count of elements from the buffer.
  boost::uint32_t count;
  std::memcpy(&count, current, sizeof count);
  current += sizeof count;

  // Handle endianness.
  using ::boost::asio::detail::socket_ops::network_to_host_long;
  count = network_to_host_long(count);

  // With the count extracted, create the appropriate sized collection.
  std::vector<foo> foos(count);

  // Deserialize each foo from the buffer.
  BOOST_FOREACH(foo& foo, foos)
  {
    deserialize(foo, current);
    current += foo_packed_size;
  }

  return foos;
}

int main()
{
  // Create a collection of foo objects with pre populated data.
  std::vector<foo> foos_expected(5);
  char a = 'a',
       b = 'A';
  boost::uint16_t c = 100;

  // Populate each element. 
  BOOST_FOREACH(foo& foo, foos_expected)
  {
    foo.a = a++;
    foo.b = b++;
    foo.c = c++;
  }

  // Serialize the collection into a buffer.
  std::vector<unsigned char> buffer = serialize(foos_expected);

  // Deserialize the buffer back into a collection.
  std::vector<foo> foos_actual = deserialize(buffer);

  // Compare the two.
  std::cout << (foos_expected == foos_actual) << std::endl; // expect 1

  // Negative test.
  foos_expected[0].c = 0;
  std::cout << (foos_expected == foos_actual) << std::endl; // expect 0
}

Which produces the expected results of 1 and 0.


If using the same compiler and architecture, then it may be possible to reinterpret a contiguous sequence of foo objects from a raw buffer as an array of foo objects, and populate std::vector<foo> with copy constructors. For example:

// Create and populate a contiguous sequence of foo objects.
std::vector<foo> foo1;
populate(foo1);

// Get a handle to the contiguous memory block.
const char* buffer = reinterpret_cast<const char*>(&foo1[0]);

// Populate a new vector via iterator constructor.  
const foo* begin = reinterpret_cast<const foo*>(buffer);
std::vector<foo> foo2(begin, begin + foo1.size());

In the end, foo1 should be equal to foo2. The foo objects in foo2 will be copy-constructed from the reinterpreted foo objects residing in memory owned by foo1.

#include <iostream>
#include <vector>
#include <boost/cstdint.hpp>
#include <boost/foreach.hpp>
#include <boost/tuple/tuple.hpp>            // boost::tie
#include <boost/tuple/tuple_comparison.hpp> // operator== for boost::tuple

/// @brief Mockup type.
struct foo
{
  char a;
  char b;
  boost::uint16_t c;
};

/// @brief Equality check for foo objects.
bool operator==(const foo& lhs, const foo& rhs)
{
  return boost::tie(lhs.a, lhs.b, lhs.c) ==
         boost::tie(rhs.a, rhs.b, rhs.c);
}

int main()
{
  // Create a collection of foo objects with pre populated data.
  std::vector<foo> foos_expected(5);
  char a = 'a',
       b = 'A';
  boost::uint16_t c = 100;

  // Populate each element. 
  BOOST_FOREACH(foo& foo, foos_expected)
  {
    foo.a = a++;
    foo.b = b++;
    foo.c = c++;
  }

  // Treat the collection as a raw buffer.
  const char* buffer =
      reinterpret_cast<const char*>(&foos_expected[0]);

  // Populate a new vector.  
  const foo* begin = reinterpret_cast<const foo*>(buffer);
  std::vector<foo> foos_actual(begin, begin + foos_expected.size());

  // Compare the two.
  std::cout << (foos_expected == foos_actual) << std::endl; 

  // Negative test.
  foos_expected[0].c = 0;
  std::cout << (foos_expected == foos_actual) << std::endl;
}

As with the other approach, this produces the expected results of 1 and 0.
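When relying on the raw approach, it may help to state its assumptions in code. The following is a sketch under the assumption of C++11 and a 4-byte foo with no padding: to_bytes and from_bytes are hypothetical helpers, and std::memcpy is used in place of reinterpret_cast because a received byte buffer is not guaranteed to be suitably aligned for foo.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

struct foo
{
  char a;
  char b;
  std::uint16_t c;
};

// Raw byte copies are only well defined for trivially copyable types.
static_assert(std::is_trivially_copyable<foo>::value,
              "foo must be trivially copyable to be copied as raw bytes");
// Guard the assumed layout; another compiler could insert padding.
static_assert(sizeof(foo) == 4, "unexpected padding in foo");

// Copy a vector of foo into a raw byte buffer (host byte order).
std::vector<unsigned char> to_bytes(const std::vector<foo>& foos)
{
  std::vector<unsigned char> bytes(foos.size() * sizeof(foo));
  if (!foos.empty())
    std::memcpy(&bytes[0], &foos[0], bytes.size());
  return bytes;
}

// Rebuild foo objects from raw bytes; a trailing partial element is ignored.
std::vector<foo> from_bytes(const std::vector<unsigned char>& bytes)
{
  std::vector<foo> foos(bytes.size() / sizeof(foo));
  if (!foos.empty())
    std::memcpy(&foos[0], &bytes[0], foos.size() * sizeof(foo));
  return foos;
}
```

The static assertions turn a silent layout mismatch into a compile error on the side where it occurs, but they do not detect a difference in byte order between the two machines, so the same-architecture assumption still applies.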

Upvotes: 4

Galimov Albert

Reputation: 7357

First, it's not safe to use pragma pack(1). Packing can differ between compilers and architectures. You will also run into problems when the protocol changes. I suggest using Google Protocol Buffers instead.

Second, you are sending a std::vector, but the actual data of this vector is not inside the WAVEFORM_DATA_STRUCT structure (a vector holds its data on the heap). So you are sending the vector object and its pointer into the heap to another machine, where this pointer is definitely invalid. You need to serialize your vector somehow.
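A small sketch can make this visible. The packet type below is a hypothetical stand-in (the element type double and the helper name elements_stored_outside are assumptions, and C++11 is assumed); it checks whether the vector's elements lie within the bytes of the enclosing object.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a packet type that holds a std::vector member.
struct packet
{
  std::uint32_t numWaveforms;
  std::vector<double> waveforms;
};

// True when the vector's elements live outside the packet object itself,
// meaning a raw copy of the packet's bytes would not include them.
bool elements_stored_outside(const packet& p)
{
  if (p.waveforms.empty()) return true; // No elements to locate.

  const unsigned char* object_begin =
      reinterpret_cast<const unsigned char*>(&p);
  const unsigned char* object_end = object_begin + sizeof(packet);
  const unsigned char* data =
      reinterpret_cast<const unsigned char*>(&p.waveforms[0]);

  // std::less gives a well-defined ordering for unrelated pointers.
  std::less<const unsigned char*> before;
  return before(data, object_begin) || !before(data, object_end);
}
```

Note that sizeof(packet) stays the same no matter how many elements the vector holds, which is another sign that the elements are not part of the object's own bytes.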

P.S. This has nothing to do with boost::asio yet; the issue is about correct serialization and deserialization.

Upvotes: 0
