user64675

Reputation: 366

Binary deserialization: how do I find out what type of serialization this is?

I have a .dat file generated by a program. The program's author does not object to people parsing and editing this file, but he will not answer anyone's questions about it.

The file mostly consists of variables that are defined in this way:

In most cases:

(4 bytes - length of the var name)(var name)(4 bytes of some internal var type)(4 bytes - possibly the element count)(X bytes of var value)

Rarely:

(4 bytes - length of the var name)(var name)(1 zero byte)(4 bytes of some internal var type)

So, for example:

([4 0 0 0][name])[11 0 0 0][1 0 0 0]([9 0 0 0][Alexander])

and

([8 0 0 0][names])[6 0 0 0](length [3 0 0 0])([4 0 0 0][John])([4 0 0 0][Anne])([7 0 0 0][SomeGuy])

I looked at Boost binary serialization, but it doesn't store variable names in the file, and I think it uses 8 bytes, not 4.

Upvotes: 0

Views: 86

Answers (2)

bolov

Reputation: 75825

To add to BoundaryImposition's answer: there is no deserialization framework (that I know of) that can deal with "any" format. The format must be known and implemented by the library, so you need to implement it yourself.
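As a rough idea of what "implement it yourself" could look like, here is a minimal Python sketch of a reader for the common record layout described in the question. It rests on guesses, not on any specification: the 4-byte fields are assumed to be little-endian unsigned integers (the `[4 0 0 0]` dumps suggest this), the third field is assumed to be an element count, and every element is assumed to be a length-prefixed ASCII string, as in the two examples. The function names (`read_u32`, `read_string`, `read_record`, `write_string`) are made up for illustration.

```python
import io
import struct


def read_u32(stream):
    """Read one 4-byte unsigned integer (little-endian is an assumption)."""
    raw = stream.read(4)
    if len(raw) != 4:
        raise EOFError("truncated 4-byte field")
    return struct.unpack("<I", raw)[0]


def read_string(stream):
    """Read a string stored as (4-byte length)(that many bytes)."""
    length = read_u32(stream)
    raw = stream.read(length)
    if len(raw) != length:
        raise EOFError("truncated string")
    return raw.decode("ascii")  # the encoding is a guess as well


def read_record(stream):
    """Read (name)(type code)(count)(count length-prefixed strings).

    Assumes every value is a string; other type codes probably need
    their own decoding, which only the real file can reveal.
    """
    name = read_string(stream)
    type_code = read_u32(stream)
    count = read_u32(stream)
    values = [read_string(stream) for _ in range(count)]
    return name, type_code, values


def write_string(text):
    """Helper that builds the byte form of a length-prefixed string."""
    data = text.encode("ascii")
    return struct.pack("<I", len(data)) + data


# Bytes of the first example from the question, rebuilt by hand.
sample = write_string("name") + struct.pack("<II", 11, 1) + write_string("Alexander")
record = read_record(io.BytesIO(sample))
print(record)  # ('name', 11, ['Alexander'])
```

The rarer layout with the extra zero byte is not handled here; you would have to work out from real files what tells the two layouts apart before adding a branch for it.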

Upvotes: 0

Lightness Races in Orbit

Reputation: 385274

There is no generic way to determine "what type of serialization" it is. The author of the format has made design decisions and arrived at a final format. It could be literally anything. You can make educated guesses ("reverse engineering") but the only way to know for sure is to obtain a specification from the author. Although you claim that he doesn't mind people manipulating files stored in this format, his refusal to provide said specification makes me wonder whether this is really true and, ultimately, means you may have to stick with guesswork.
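For the educated-guessing part, one common heuristic is to scan the file for byte patterns that look like a known building block. The Python sketch below is only an illustration of that idea: it assumes, as the question's dumps suggest, that strings are stored as a 4-byte little-endian length followed by that many printable ASCII bytes. The function name `find_prefixed_strings` and the `max_len` cutoff are invented for this example, and false positives are possible.

```python
import struct


def find_prefixed_strings(data, max_len=64):
    """Yield (offset, text) wherever a 4-byte little-endian length is
    followed by exactly that many printable ASCII bytes."""
    offset = 0
    while offset + 4 <= len(data):
        (length,) = struct.unpack_from("<I", data, offset)
        if 0 < length <= max_len:
            chunk = data[offset + 4 : offset + 4 + length]
            if len(chunk) == length and all(32 <= b < 127 for b in chunk):
                yield offset, chunk.decode("ascii")
                offset += 4 + length  # continue right after the match
                continue
        offset += 1  # no match here, try the next byte


# First example from the question, written out as raw bytes.
data = (
    b"\x04\x00\x00\x00name"
    b"\x0b\x00\x00\x00"
    b"\x01\x00\x00\x00"
    b"\x09\x00\x00\x00Alexander"
)
hits = list(find_prefixed_strings(data))
print(hits)  # [(0, 'name'), (16, 'Alexander')]
```

The gaps between hits (here the 8 bytes at offsets 8 to 15) are the fields still to be explained, which is where comparing several files with known contents helps.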

Upvotes: 2
