Aviv Cohn

Reputation: 17233

Serialization of objects to files - is dumping char* back-and-forth safe?

I would like to write an object to a file, and later be able to read it from the file. It should work on different machines.

One simple option is the following:

#include <fstream>
#include <string>

struct OBJECT { // The object to be serialized / deserialized
    // Members are serialized / deserialized in the order they are declared. Bit-packing can be used as well.
    DATATYPE member1;
    DATATYPE member2;
    DATATYPE member3;
    DATATYPE member4;
};

void write(const std::string& file_name, const OBJECT& data) // Writes the given OBJECT data to the given file name.
{
    std::ofstream out(file_name, std::ios::binary);
    // Dump the raw bytes of the object; the stream is closed by its destructor.
    out.write(reinterpret_cast<const char*>(&data), sizeof(OBJECT));
}

void read(const std::string& file_name, OBJECT& data) // Reads the given file and assigns the data to the given OBJECT.
{
    std::ifstream in(file_name, std::ios::binary);
    // Overwrite the object's bytes with the file contents; the stream is closed by its destructor.
    in.read(reinterpret_cast<char*>(&data), sizeof(OBJECT));
}

For serialization, this approach casts the address of the struct to a char* and writes sizeof(OBJECT) bytes to the file. For deserialization, it does the opposite: it reads sizeof(OBJECT) bytes from the file directly into the struct.

Please consider the following situation: we run the program on Machine 1 and save an object to File 1. Later, we copy File 1 to Machine 2 and run the program (which might have been compiled with a different compiler) on Machine 2. We want to be able to read the data from the file.

Is this approach safe? Or is it better to "manually" read individual pieces from the file and copy them into the resulting struct, and vice versa?

Upvotes: 0

Views: 37

Answers (1)

cdhowie

Reputation: 169143

This approach is generally safe only if all of the following are true:

  • The type contains only "inline" data. This means only fundamental types and arrays thereof, with no pointers or references (a stored address is meaningless in another process or on another machine).
    • Nested objects that also meet these criteria are also OK.
  • All of the types have the same representation on all systems/compilers where the program will run.
    • The endianness of the systems must match.
    • The size of each type must match.
    • Any padding is in the same location and of the same size.
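Some of these conditions can be checked by the compiler. The following is only a minimal sketch, assuming a hypothetical Record type with fixed-width members (the name, the members, and the helper is_little_endian are illustrative and not from the question). The checks make a build fail when the layout differs from what was expected, but they do not make the raw dump portable by themselves.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical example type; fixed-width members make the sizes explicit.
struct Record {
    std::uint32_t id;
    std::int16_t  x;
    std::int16_t  y;
    float         weight;
};

// Copying raw bytes is only meaningful for trivially copyable types.
// This rejects members such as std::string, but it does NOT detect raw pointers.
static_assert(std::is_trivially_copyable<Record>::value,
              "Record must be trivially copyable");
static_assert(std::is_standard_layout<Record>::value,
              "Record must have standard layout");

// If the total size equals the sum of the member sizes, this compiler
// inserted no padding between or after the members.
static_assert(sizeof(Record) == sizeof(std::uint32_t) + 2 * sizeof(std::int16_t) + sizeof(float),
              "Record contains unexpected padding");
static_assert(sizeof(float) == 4, "float is expected to be 4 bytes");

// Runtime byte-order probe: true when the least significant byte is stored first.
inline bool is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first = 0;
    std::memcpy(&first, &probe, 1);
    return first == 1;
}
```

A writer could store the result of is_little_endian() in the file so that a reader on another machine can at least detect a mismatch instead of silently reading garbage.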

However, there are still some drawbacks. One of the most severe is that, without some kind of versioning mechanism, you are locked in to exactly this structure: once a member is added, removed, or reordered, existing files can no longer be read correctly. Other data interchange formats (XML, JSON, etc.), while more verbose, self-document their structure and are significantly more future-proof.

Upvotes: 2
