Sergei Danielian

Reputation: 5025

How to develop a class to parse a binary file using template and/or rules

I am trying to write a template-based class that works along the lines of python-bitstring or Python's Hachoir. The final goal is to describe rules or patterns for parsing a given binary file (010 Editor has a similar feature called Binary Templates).

Here is a simple example of how I imagine it working:

struct BinaryFileFormat
{
  short sMagic;
  int sField_1, sField_2;
  short sFrameLength;
  int OverallLength;
};

int main(int argc, char** argv)
{
  // Storage for the parsed file
  BinaryFileFormat bff;

  // Describe a pattern for parsing a binary file with statistical information
  // (adjacent string literals are concatenated; one literal cannot span lines)
  Rule r = "<Magic>:2,"
           "<Field_1>:4,"
           "<Field_2>:4,"
           "<FrameLength>:2,"
           "<OverallLength>:4";

  // Parse the file and place all data into the structure
  BinaryParser bp(r, "<path_to_file>", bff);

  // Now we can work with the BinaryFileFormat structure
  // ...
}

So, question #1: how can I do something like this in C++? Is it possible at all? A simple clue or hint would be enough to point me in the right direction.

Question #2: is it possible to dynamically create the corresponding structure (BinaryFileFormat in the example) to store all the binary data?

P.S. I know that deserialization does something similar, but it does not fit my case: in the end I want to avoid using any structures at all (if that is possible, of course).

Upvotes: 2

Views: 297

Answers (1)

Andrew Tomazos

Reputation: 68698

This sort of thing is usually done as a precompile step. A grammar definition is read by some tool, and this tool outputs C++ code containing the data structures that hold the model as well as the code needed to serialize and deserialize it to and from a file.
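
As a rough illustration of the precompile approach, here is a toy generator that reads a rule string in the `<Name>:Bytes` style used in the question and returns C++ source text for a matching struct. The function name `generate_struct` and the restriction to widths of 1, 2, 4 and 8 bytes are assumptions made for this sketch. A real tool would also emit the serialization and deserialization code, which is left out here.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical generator: turns "<Name>:Bytes,<Name>:Bytes,..." into the
// source text of a struct with one fixed-width signed integer per field.
inline std::string generate_struct(const std::string& struct_name,
                                   const std::string& rule) {
    assert(!struct_name.empty());

    std::ostringstream out;
    out << "#include <cstdint>\n\nstruct " << struct_name << " {\n";

    std::istringstream in(rule);
    std::string item;
    while (std::getline(in, item, ',')) {
        // Locate the "<", ">" and ":" that delimit the name and the width.
        const std::size_t open = item.find('<');
        const std::size_t close = item.find('>');
        const std::size_t colon = item.find(':');
        if (open == std::string::npos || close == std::string::npos ||
            colon == std::string::npos || !(open < close && close < colon)) {
            throw std::invalid_argument("malformed field: " + item);
        }
        const std::string name = item.substr(open + 1, close - open - 1);
        const int width = std::stoi(item.substr(colon + 1));
        if (width != 1 && width != 2 && width != 4 && width != 8) {
            throw std::invalid_argument("unsupported width in: " + item);
        }
        // Map the byte width to a fixed-width type such as std::int16_t.
        out << "    std::int" << width * 8 << "_t " << name << ";\n";
    }
    out << "};\n";
    return out.str();
}
```

The returned text would be written to a header by the build system before the main compilation, so the rest of the program can include the generated struct like any hand-written one.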

However, you can look at boost::spirit and friends as a way to do this in C++ at (normal) compile time.

If you want to do it yourself, you can do something like...

Rule r = Rule("Magic", &BinaryFileFormat::sMagic, 2) +
         Rule("Field1", &BinaryFileFormat::sField_1, 4) +
         Rule("Field2", &BinaryFileFormat::sField_2, 4) +
         ...;

...and use type deduction to determine correct primitive encoding/decoding.
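
A minimal sketch of how such a `Rule` could be built, under some assumptions the answer does not state: `Rule` is made a class template over the target struct, fields are integers stored little-endian in the file, and the struct from the question is rewritten with fixed-width types so the declared sizes (2 and 4) are guaranteed to match. The helper `make_rule` and the method names `size` and `parse` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Fixed-width stand-in for the struct in the question (assumption).
struct BinaryFileFormat {
    std::int16_t sMagic;
    std::int32_t sField_1, sField_2;
    std::int16_t sFrameLength;
    std::int32_t OverallLength;
};

// Hypothetical Rule: an ordered list of field decoders for a target type T.
template <class T>
class Rule {
public:
    // The field type F is deduced from the pointer-to-member argument.
    template <class F>
    Rule(std::string name, F T::*member, std::size_t size) {
        static_assert(std::is_integral<F>::value, "sketch handles integers only");
        assert(size == sizeof(F) && "declared size must match the member type");
        steps_.push_back(Step{std::move(name), size,
            [member](T& out, const unsigned char* p) {
                // Assemble the value byte by byte, assuming little-endian data.
                std::uint64_t v = 0;
                for (std::size_t i = 0; i < sizeof(F); ++i)
                    v |= static_cast<std::uint64_t>(p[i]) << (8 * i);
                out.*member = static_cast<F>(v);
            }});
    }

    // Concatenate two rules; fields are decoded in left-to-right order.
    friend Rule operator+(Rule lhs, const Rule& rhs) {
        lhs.steps_.insert(lhs.steps_.end(), rhs.steps_.begin(), rhs.steps_.end());
        return lhs;
    }

    // Total number of bytes this rule consumes.
    std::size_t size() const {
        std::size_t n = 0;
        for (const Step& s : steps_) n += s.size;
        return n;
    }

    // Decode `bytes` into `out`; returns false if the buffer is too short.
    bool parse(const std::vector<unsigned char>& bytes, T& out) const {
        if (bytes.size() < size()) return false;
        const unsigned char* p = bytes.data();
        for (const Step& s : steps_) {
            s.decode(out, p);
            p += s.size;
        }
        return true;
    }

private:
    struct Step {
        std::string name;  // kept for diagnostics only
        std::size_t size;
        std::function<void(T&, const unsigned char*)> decode;
    };
    std::vector<Step> steps_;
};

// The layout from the question, written with the hypothetical Rule above.
inline Rule<BinaryFileFormat> make_rule() {
    typedef Rule<BinaryFileFormat> R;
    return R("Magic", &BinaryFileFormat::sMagic, 2) +
           R("Field1", &BinaryFileFormat::sField_1, 4) +
           R("Field2", &BinaryFileFormat::sField_2, 4) +
           R("FrameLength", &BinaryFileFormat::sFrameLength, 2) +
           R("OverallLength", &BinaryFileFormat::OverallLength, 4);
}
```

Reading the file into a `std::vector<unsigned char>` and calling `make_rule().parse(bytes, bff)` would then fill the struct field by field. Decoding byte by byte, rather than copying raw memory into the struct, sidesteps padding and host byte-order issues.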

Upvotes: 3
