Kyu96

Reputation: 1349

Parsing custom data packets in an object oriented manner

I am currently developing some software in C++ that sends and receives custom data packets, and I want to parse and manage these packets in a well-structured manner. I first receive the header and then the body of the data. The main problem is that I don't like creating a Packet object with only the header information and adding the body data later. What is an elegant way of parsing and storing custom data packets?

Here is a rough sketch of what such a custom data packet could look like:

+-------+---------+---------+----------+------+
| Magic | Command | Options | Bodysize | Body |
+-------+---------+---------+----------+------+

(Let's assume Magic is 4 bytes, Command 1 byte, Options 2 bytes, Bodysize 4 bytes, and the body itself is variable in length.) How would I parse this without using any third-party libraries?

Normally I'd say something like this could be done to store packet data:

#include <array>
#include <cstdint>
#include <vector>

class Packet {
public:

    explicit Packet(std::array<char, 11> headerbytes); // 4 + 1 + 2 + 4 header bytes

    void set_body(std::vector<char> data);
    std::vector<char> get_body();

    int8_t get_command();

    int16_t get_options();

    bool is_valid();

private:

    bool valid;

    int8_t _command;

    int16_t _options;

    int32_t body_size;

    std::vector<char> _data;

};

The problem is that I provide the header information first and then add the body data in a hacky way later on. For a period of time, the packet object is accessible in an incomplete state.

I first receive the header, and once it has arrived another receive call is made to read the body. Would it make sense to have a parser instance that populates the packet object and only makes it accessible once it holds all the needed information? Would it make sense to have separate classes for the header and the body? What would be the best design choice?

I am developing in C++ and use the Boost library to send and receive data over sockets.

Upvotes: 3

Views: 2474

Answers (5)

Tomaz Stih

Reputation: 579

I'm late to the party, but I have a similar problem. I'm implementing the GDB stub protocol and I'm thinking about using the chain-of-responsibility pattern (similar to a pipeline). The idea is to have a base message class msg and various messages

breakpoint_msg, step_msg, mread_msg, etc...

all derived from it. Each of them has a function can_handle(package). When I receive a package, I iterate through all message classes, calling can_handle(package) on each. The one that recognizes the message is the one that gets instantiated with the package data. I then call exec() on it.
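
A minimal sketch of this idea, under some assumptions: can_handle is made static so it can be called before any object exists, packages are plain strings, and the one-letter prefixes, the handler struct, and the dispatch function are hypothetical names chosen for illustration, not taken from the GDB protocol or from the answer.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Base class for all messages; exec() performs the message's action.
struct msg {
    virtual ~msg() = default;
    virtual std::string exec() = 0;
};

// Hypothetical message types; the package prefixes are illustrative only.
struct step_msg : msg {
    static bool can_handle(const std::string& package) {
        return !package.empty() && package[0] == 's';
    }
    explicit step_msg(const std::string&) {}
    std::string exec() override { return "step"; }
};

struct mread_msg : msg {
    static bool can_handle(const std::string& package) {
        return !package.empty() && package[0] == 'm';
    }
    explicit mread_msg(const std::string& package) : args(package.substr(1)) {
        assert(can_handle(package));  // callers must check can_handle first
    }
    std::string exec() override { return "read " + args; }
    std::string args;
};

// One chain entry: a recognizer plus a factory for the matching message.
struct handler {
    std::function<bool(const std::string&)> can_handle;
    std::function<std::unique_ptr<msg>(const std::string&)> create;
};

template <class M>
handler make_handler() {
    return {&M::can_handle,
            [](const std::string& p) -> std::unique_ptr<msg> {
                return std::make_unique<M>(p);
            }};
}

// Walks the chain and instantiates the first message that recognizes the package.
std::unique_ptr<msg> dispatch(const std::string& package) {
    static const std::vector<handler> chain = {
        make_handler<step_msg>(), make_handler<mread_msg>()};
    for (const handler& h : chain) {
        if (h.can_handle(package)) return h.create(package);
    }
    return nullptr;  // no handler recognized the package
}
```

A caller would check the result of dispatch(package) for nullptr and otherwise call exec() on it. Adding a message type then only means adding one entry to the chain.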

Upvotes: 0

U. Sivri

Reputation: 28

You can use exceptions to prevent creation of incomplete packet objects.

I'd use char pointers instead of vectors for performance.

#include <cstdint>
#include <cstring>
#include <stdexcept>

// not intended to be inherited
class Packet final {
public:
    Packet(const char* data, unsigned int data_len) {
        if(data_len < header_len) {
            throw std::invalid_argument("data too small");
        }

        const char* dataIter = data;

        if(!check_validity(dataIter)) {
            throw std::invalid_argument("invalid magic word");
        }
        dataIter += sizeof(magic);
        // memcpy (rather than a pointer cast) avoids unaligned reads;
        // multi-byte fields are assumed to arrive in host byte order
        memcpy(&command, dataIter, sizeof(command));
        dataIter += sizeof(command);
        memcpy(&options, dataIter, sizeof(options));
        dataIter += sizeof(options);
        memcpy(&body_size, dataIter, sizeof(body_size));
        dataIter += sizeof(body_size);

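        // reject negative sizes and bodies that extend past the received data
        if(body_size < 0 || data_len - header_len < static_cast<unsigned int>(body_size)) {
            throw std::invalid_argument("data body too small");
        }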

        body = new char[body_size];
        memcpy(body, dataIter, body_size);
    }

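    ~Packet() {
        delete[] body;
    }

    // the class owns a raw pointer: forbid copies to avoid a double delete
    Packet(const Packet&) = delete;
    Packet& operator=(const Packet&) = delete;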

    int8_t get_command() const {
        return command;
    }

    int16_t get_options() const {
        return options;
    }

    int32_t get_body_size() const {
        return body_size;
    }

    const char* get_body() const {
        return body;
    }

private:
    // assumes len enough, may add param in_len for robustness
    static bool check_validity(const char* in_magic) {
        return ( 0 == memcmp(magic, in_magic, sizeof(magic)) );
    }

    constexpr static char magic[] = {'a','b','c','d'};
    int8_t command;
    int16_t options;
    int32_t body_size;
    char* body;

    constexpr static unsigned int header_len = sizeof(magic) + sizeof(command)
            + sizeof(options) + sizeof(body_size);
};
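
The memcpy calls above copy multi-byte fields in the host's byte order, which only works if sender and receiver agree on it. A hedged sketch of decoding the same fields byte by byte follows. Big-endian wire order is an assumption (the question does not specify one), and read_be, HeaderFields, and decode_fields are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reads an unsigned big-endian integer of n bytes (n <= 4) starting at p.
// Big-endian wire order is an assumption, not something the question states.
inline std::uint32_t read_be(const unsigned char* p, std::size_t n) {
    assert(n <= 4);
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < n; ++i) {
        value = (value << 8) | p[i];
    }
    return value;
}

// Hypothetical plain struct holding the decoded header fields.
struct HeaderFields {
    std::uint8_t command;
    std::uint16_t options;
    std::uint32_t body_size;
};

// Decodes the 7 bytes that follow the 4-byte magic word.
inline HeaderFields decode_fields(const unsigned char* after_magic) {
    HeaderFields f;
    f.command = static_cast<std::uint8_t>(read_be(after_magic, 1));
    f.options = static_cast<std::uint16_t>(read_be(after_magic + 1, 2));
    f.body_size = read_be(after_magic + 3, 4);
    return f;
}
```

Because the result no longer depends on the host, the same code should behave identically on little-endian and big-endian machines.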

Note: this is my first post on SO, so please let me know if something's wrong with the post, thanks.

Upvotes: 1

Flaviu

Reputation: 1029

For this case I would use the pipeline design pattern, creating three packet-processor classes:

  • Command (handles magic bytes too)
  • Options
  • Body (handles body size too)

all derived from one base class.

typedef unsigned char byte;

namespace Packet
{
    namespace Processor
    {
        namespace Field
        {
            class Item
            {
            public:
                /// Returns true when the field was fully processed, false otherwise.
                virtual bool operator () (const byte*& begin, const byte* const end) = 0;
            };

            class Command: public Item
            {
            public:
                virtual bool operator () (const byte*& begin, const byte* const end);
            };

            class Options: public Item
            {
            public:
                virtual bool operator () (const byte*& begin, const byte* const end);
            };

            class Body: public Item
            {
            public:
                virtual bool operator () (const byte*& begin, const byte* const end);
            };
        }

        class Manager
        {
        public:
            /// Called every time new data is received
            void operator () (const byte* begin, const byte* const end)
            {
                while((*fields[index])(begin, end))
                {
                    incrementIndex();
                }
            }

        protected:
            void incrementIndex();

            Field::Command command;
            Field::Options options;
            Field::Body body;
            Field::Item* const fields[3] = { &command, &options, &body };
            byte index = 0; // start with the first field
        };
    }
}
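
The field processors above are only declared. As a rough sketch of what one of them might do internally, the hypothetical FixedField template below collects a fixed number of bytes across several receive calls, using the same operator() signature. The buffer, data(), and reset() members are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned char byte;

// Collects exactly N bytes, possibly spread over several receive calls.
template <std::size_t N>
class FixedField
{
public:
    /// Consumes bytes from [begin, end); returns true once all N bytes are stored.
    bool operator () (const byte*& begin, const byte* const end)
    {
        while (filled < N && begin != end)
        {
            buffer[filled++] = *begin++;
        }
        return filled == N;
    }

    /// Only valid once operator() has returned true.
    const byte* data() const { assert(filled == N); return buffer; }

    /// Prepares the field for the next packet.
    void reset() { filled = 0; }

private:
    byte buffer[N] = {};
    std::size_t filled = 0;
};
```

An Options processor could wrap FixedField<2>, for example. In this sketch a completed field keeps returning true without consuming input, so a manager that wraps around to the first field would need to call reset() on each field before starting the next packet.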

Upvotes: 1

Flaviu

Reputation: 1029

I'm guessing you are trying object-oriented networking. If so, the best solution for such parsing would be FlatBuffers or the Cap’n Proto C++ code generator. By defining a schema, you get generated code that parses the packets in an efficient and safe way.

Upvotes: 0

Davis Herring

Reputation: 39768

If you don’t want to tie the data reading into one complete constructor (for understandable reasons of separation of concerns), this is a good application for non-polymorphic inheritance:

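#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

struct Header {
  static constexpr std::size_t SIZE=11;  // 4 magic + 1 command + 2 options + 4 body size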
  Header(std::array<char,SIZE>);

  std::int8_t get_command() const {return command;}
  std::int16_t get_options() const {return options;}
  std::int32_t body_size() const {return length;}

private:
  std::int8_t command;
  std::int16_t options;
  std::int32_t length;
};

struct Packet : private Header {
  using Body=std::vector<char>;
  Packet(const Header &h,Body b) : Header(h),body(std::move(b))
  {if(body.size()!=body_size()) throw …;}

  using Header::get_command;
  using Header::get_options;
  const Body& get_body() const {return body;}

private:
  Body body;
};

// For some suitable Stream class:
Header read1(Stream &s)
{return {s.read<Header::SIZE>()};}
Packet read2(const Header &h,Stream &s)
{return {h,s.read(h.body_size())};}
Packet read(Stream &s)
{return read2(read1(s),s);}

Note that the private inheritance prevents undefined behavior from deleting a Packet via a Header*, as well as the surely-unintended

const Packet p=read(s);
const Packet q=read2(p,s);   // same header?!

Composition would of course work as well, but might result in more adapter code in a full implementation.

If you were really optimizing, you could make a HeaderOnly without the body size and derive Header and Packet from that.
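
A sketch of that last variant, assuming the fields have already been decoded from the raw bytes (the constructors taking decoded values are a simplification for illustration, and the exception types are placeholders):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Fields shared by Header and Packet; the body size is left out on purpose.
struct HeaderOnly {
  HeaderOnly(std::int8_t c,std::int16_t o) : command(c),options(o) {}
  std::int8_t get_command() const {return command;}
  std::int16_t get_options() const {return options;}

private:
  std::int8_t command;
  std::int16_t options;
};

// Known after the first read: the shared fields plus the announced length.
struct Header : HeaderOnly {
  Header(std::int8_t c,std::int16_t o,std::int32_t n) : HeaderOnly(c,o),length(n)
  {if(n<0) throw std::invalid_argument("negative body size");}
  std::int32_t body_size() const {return length;}

private:
  std::int32_t length;
};

// A complete packet: the length is implied by body.size(), so it is not stored twice.
struct Packet : private HeaderOnly {
  using Body=std::vector<char>;
  Packet(const Header &h,Body b) : HeaderOnly(h),body(std::move(b)) {
    assert(h.body_size()>=0);  // guaranteed by Header's constructor
    if(body.size()!=static_cast<std::size_t>(h.body_size()))
      throw std::invalid_argument("body length does not match header");
  }

  using HeaderOnly::get_command;
  using HeaderOnly::get_options;
  const Body& get_body() const {return body;}

private:
  Body body;
};
```

The saving is only the four bytes of the length per stored Packet, so this is probably worth it only when many packets are kept in memory.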

Upvotes: 1
