Grim Fandango

Reputation: 2426

Writing a class/struct that changes frequently

Summary:
I have a struct that is read from and written to a file.
This struct changes frequently, and this causes my read() function to get complex.

I need to find a good way to handle change while keeping the bug count low. Ideally, the code should make it easy to spot the changes between versions.

I have thought through a couple of patterns but I don't know if I have gone through all possible options.

As you will see, the code is mostly C-like, but I am in the process of turning it into C++.


Details
As I said, my struct changes frequently (almost in every version of the program).

So far, changes to the struct have been handled like:


// Version 1: the color is stored as a color map index
struct Obj {
    int color_index;  
};  

void Read_Obj( File *f, Obj *o ) {
    f->read( f, &o->color_index );
}

void Write_Obj( File *f, Obj *o ) {
    f->write( f, o->color_index );
}

// Version 2: the color is stored as RGB components
struct Obj {
    int color_r;
    int color_g;
    int color_b;  
};  

void Read_Obj( File *f, Obj *o ) {

    if( f->version() == File::Version1 ) {
        int color_index;
        f->read( f, &color_index );
        ColorIndex_to_RGB( o, color_index ); // we used color maps back then
    }
    else {
        f->read( f, &o->color_r );
        f->read( f, &o->color_g );
        f->read( f, &o->color_b );
    }
}      

void Write_Obj( File *f, Obj *o ) {
    f->write( f, o->color_r );
    f->write( f, o->color_g );
    f->write( f, o->color_b );
}

[brief note]

Note that I know I could have used


void Read_Obj( File *f, Obj *o ) {

    if( f->version() == File::Version1 ) {
        Read_Obj_V1( f, o );
    }
    else {
        Read_Obj_V2( f, o );
    }
}      

but that tends to cause code duplication between the sub-functions since, in real life, only 1-2 out of ~20 members of the struct change in each version. So the other 18 lines remain the same.

Of course, I could switch to this policy if there is a good reason to.

[end of brief note]


Now these structs have become complicated and I need to convert them to a class, and work in a more object-oriented fashion.

I have seen a pattern where you use one class to read for each old version, and then convert the data to a newer class.


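class Obj;  // forward declaration, needed by Obj_v1::convert_to

class Obj_v1 {
    int m_color_index;
public:
    void read( File *f ) {
        f->read( f, &m_color_index );
    }

    void convert_to( Obj *o ) { /* code to convert the older object */ }
};

class Obj {
    int m_r;
    int m_g;
    int m_b;
public:
    void read( File *f ) {
        f->read( f, &m_r );
        f->read( f, &m_g );
        f->read( f, &m_b );
    }

    // always writes the current format
    void write( File *f ) {
        f->write( f, m_r );
        f->write( f, m_g );
        f->write( f, m_b );
    }
};

void Read_Obj( File *f, Obj *o ) {

    if( f->version() == File::Version1 ) {
        Obj_v1 old;  // not "Obj_v1 old();", which would declare a function
        old.read( f );
        old.convert_to( o );
    }
    else {
        o->read( f );
    }
}

void Write_Obj( File *f, Obj *o ) {
    o->write( f );
}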

However, there are two strategies for dealing with change:

Strategy 1 : direct conversions

void Read_Obj( File *f, Obj *o ) {

    if( f->version() == File::Version1 ) {
        Obj_v1 old;
        old.read( f );
        old.convert_to( o );   // v1 -> current
    }
    else if( f->version() == File::Version2 ) {
        Obj_v2 old;
        old.read( f );
        old.convert_to( o );   // v2 -> current
    }
    else {
        o->read( f );
    }
}

Disadvantage:

Benefit:

Strategy 2 : cascaded conversions

void Read_Obj( File *f, Obj *o ) {

    Obj_v1 o1;
    Obj_v2 o2;

    if( f->version() == File::Version1 ) {
        o1.read( f );
        o1.convert_to( &o2 );  // v1 -> v2
        o2.convert_to( o );    // v2 -> current
    }
    else if( f->version() == File::Version2 ) {
        o2.read( f );
        o2.convert_to( o );    // v2 -> current
    }
    else {
        o->read( f );
    }
}

Disadvantages:

Benefit:

Worries:

Question:

thank you so much

Upvotes: 4

Views: 416

Answers (2)

Matthieu M.

Reputation: 299960

You may be able to put Google Protocol Buffers to work.

The main idea behind protobuf is to decouple the actual serialization from the class information, because you create a class dedicated to the serialization... but the real benefit lies elsewhere.

The information encoded by protobuf is naturally both backward and forward compatible: if you add a field and then decode an old file, the new field simply won't be there. Conversely, if you remove a field, the decoder will skip it when it shows up in older data.

This means that you leave the version handling to protobuf (without any real version number in fact) and then when changing your class:

  • you stop retrieving the information you don't need any longer
  • you add new fields for the new pieces of information you have

It may also help you think more carefully about what to save and in which format. It is fine to transform the data before saving it (encoding) and to transform it back when reading (decoding), so the saved format itself should change less often: you would add items, but you should rarely have to restructure data that is already encoded.
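To make the compatibility argument concrete, here is a minimal sketch of a tagged-field encoding. It is not protobuf's API or wire format; `put_field`, `parse_fields`, `decode_obj` and the tag numbers are all invented for illustration. The point it shows is that a reader that looks fields up by tag ignores tags it no longer uses and falls back to defaults for tags that are absent, without consulting a version number.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical tag numbers; once assigned, a tag is never reused.
enum Tag : std::uint8_t { TAG_COLOR_INDEX = 1, TAG_R = 2, TAG_G = 3, TAG_B = 4 };

using Bytes = std::vector<std::uint8_t>;

// Append one field as (1-byte tag, 4-byte little-endian value).
void put_field(Bytes &out, std::uint8_t tag, std::int32_t value) {
    out.push_back(tag);
    std::uint32_t v = static_cast<std::uint32_t>(value);
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>((v >> (8 * i)) & 0xFF));
}

// Decode all fields into a tag -> value map.
// This sketch assumes the input is a whole number of 5-byte fields.
std::map<std::uint8_t, std::int32_t> parse_fields(const Bytes &in) {
    assert(in.size() % 5 == 0);
    std::map<std::uint8_t, std::int32_t> fields;
    for (std::size_t pos = 0; pos + 5 <= in.size(); pos += 5) {
        std::uint32_t v = 0;
        for (int i = 0; i < 4; ++i)
            v |= static_cast<std::uint32_t>(in[pos + 1 + i]) << (8 * i);
        fields[in[pos]] = static_cast<std::int32_t>(v);
    }
    return fields;
}

struct Obj { int r = 0, g = 0, b = 0; };

// Current reader: takes the tags it knows, ignores the rest,
// and leaves the default value in place for any tag that is missing.
Obj decode_obj(const Bytes &in) {
    std::map<std::uint8_t, std::int32_t> fields = parse_fields(in);
    Obj o;
    auto get = [&fields](std::uint8_t tag, int &dst) {
        auto it = fields.find(tag);
        if (it != fields.end()) dst = it->second;
    };
    get(TAG_R, o.r);
    get(TAG_G, o.g);
    get(TAG_B, o.b);
    return o;
}
```

Feeding `decode_obj` a buffer that still contains `TAG_COLOR_INDEX` but lacks `TAG_G` yields an `Obj` with the stored `r` and `b` and a default `g`; a real format would also need a rule for deriving the RGB values from an old color index.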

Upvotes: 2

Dummy00001

Reputation: 17420

void Read_Obj( File *f, Obj *o ) {
if( f->version() == File::Version1 ) {

The if is, so to speak, a hidden switch/case. And a switch/case in C++ is generally interchangeable with polymorphism. Example:

struct Reader {
   virtual ~Reader() {}  // readers are destroyed through a base pointer
   virtual void Read_Obj( File *f, Obj *o ) = 0;
   /* methods to read further objects */
};

struct ReaderV1 : public Reader {
   void Read_Obj( File *f, Obj *o ) { /* ... */ }
   /* methods to read further objects */
};

struct ReaderV2 : public Reader {
   void Read_Obj( File *f, Obj *o ) { /* ... */ }
   /* methods to read further objects */
};

And then instantiate the appropriate Reader descendant after opening the file and detecting the version number. That way you would have only one file version check in the top level code, instead of polluting all of the low-level code with the checks.

If code is common between the file versions, for convenience you can also put it into the base reader class.
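As a rough, self-contained sketch of that arrangement: the `File` below is a made-up in-memory stand-in (not the question's actual `File`), `Obj::size` is an invented member assumed to exist in every file version, `make_reader` is a hypothetical factory name, and the grey-level conversion is only a placeholder for the real color map lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the question's File: a version plus a stream of ints.
struct File {
    enum Version { Version1 = 1, Version2 = 2 };
    Version version;
    std::vector<int> data;
    std::size_t pos;

    int read_int() {
        if (pos >= data.size()) throw std::runtime_error("unexpected end of file");
        return data[pos++];
    }
};

// Obj::size is an invented member that is stored in every file version.
struct Obj { int r, g, b, size; };

struct Reader {
    virtual ~Reader() {}
    virtual void Read_Obj(File *f, Obj *o) = 0;
protected:
    // Code common to all file versions lives in the base class.
    static void read_common(File *f, Obj *o) {
        assert(f && o);
        o->size = f->read_int();
    }
};

struct ReaderV1 : public Reader {
    void Read_Obj(File *f, Obj *o) {
        read_common(f, o);
        int color_index = f->read_int();
        // Placeholder for the real color map lookup: treat the index as a grey level.
        o->r = o->g = o->b = color_index;
    }
};

struct ReaderV2 : public Reader {
    void Read_Obj(File *f, Obj *o) {
        read_common(f, o);
        o->r = f->read_int();
        o->g = f->read_int();
        o->b = f->read_int();
    }
};

// The only place that looks at the file version.
std::unique_ptr<Reader> make_reader(File::Version v) {
    switch (v) {
    case File::Version1: return std::unique_ptr<Reader>(new ReaderV1);
    case File::Version2: return std::unique_ptr<Reader>(new ReaderV2);
    }
    throw std::runtime_error("unsupported file version");
}
```

The top-level loading code would call `make_reader(f.version)` once after opening the file and pass the resulting reader down, so none of the per-object functions needs its own version check.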

I would strongly advise against the variant with class Obj_v1 and class Obj where the read() method belongs to the Obj itself. That way one easily ends up with circular dependencies, and it is also a bad idea to make an object aware of its persistent representation. IME (in my experience) it is better to have a third-party reader class hierarchy responsible for that. (As with std::iostream vs. std::string vs. operator <<: the stream doesn't know the string, the string doesn't know the stream, and only operator << knows both.)

Otherwise, I personally do not see any big difference between your "Strategy 1" and "Strategy 2". They both use convert_to(), which I personally think is superfluous. IME the polymorphic solution should be used instead: it converts everything directly to the up-to-date class Obj, without the intermediate class Obj_v1 and class Obj_v2. Since with polymorphism you have a dedicated read function for every version, ensuring that the object is correctly recreated from the data read is easy.

Are there any other patterns that do a better job at this? The ones of you that had some experience with my proposals, what do you think of my worries on the above implementations? Which are preferable solutions?

This is precisely what polymorphism was intended to address and how I generally do such tasks myself.

This is related to object serialization, but I have not seen a single serialization framework (my info is likely outdated) which was capable of supporting several versions of the same class.

I personally did end up several times with the following serialization/deserialization class hierarchy:

  • abstract reader interface (very slim by definition)
  • utility classes implementing the reading and writing of the actual objects from/to the streams (fat, highly reusable code, was used for network transfers too)
  • versioned implementations of the reader interface (relatively slim, reuse the fat utility classes)
  • writer interface/class (I was always writing the up-to-date version of the file. Versioning was used only during reading.)
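The four pieces listed above could be laid out roughly as follows. This is only a sketch under assumptions: the text-based stream format, the names `StreamUtil`, `write_file`, `read_file` and `kCurrentVersion`, and the grey-level conversion for version 1 are all invented for illustration.

```cpp
#include <cassert>
#include <istream>
#include <memory>
#include <ostream>
#include <sstream>
#include <stdexcept>

struct Obj { int r, g, b; };

// Utility layer: moves primitives to and from streams, knows nothing about versions.
struct StreamUtil {
    static int read_int(std::istream &in) {
        int v;
        if (!(in >> v)) throw std::runtime_error("read failed");
        return v;
    }
    static void write_int(std::ostream &out, int v) { out << v << ' '; }
};

// Slim abstract reader interface.
struct Reader {
    virtual ~Reader() {}
    virtual Obj read_obj(std::istream &in) = 0;
};

// Versioned readers: thin, they only decide which fields to read and in what order.
struct ReaderV1 : Reader {
    Obj read_obj(std::istream &in) {
        int index = StreamUtil::read_int(in);
        Obj o = { index, index, index };  // placeholder for a real color map lookup
        return o;
    }
};

struct ReaderV2 : Reader {
    Obj read_obj(std::istream &in) {
        Obj o;
        o.r = StreamUtil::read_int(in);
        o.g = StreamUtil::read_int(in);
        o.b = StreamUtil::read_int(in);
        return o;
    }
};

const int kCurrentVersion = 2;

// Writer: always emits the current version, preceded by a version header.
void write_file(std::ostream &out, const Obj &o) {
    StreamUtil::write_int(out, kCurrentVersion);
    StreamUtil::write_int(out, o.r);
    StreamUtil::write_int(out, o.g);
    StreamUtil::write_int(out, o.b);
}

// Top-level read: the only place that inspects the version header.
Obj read_file(std::istream &in) {
    int version = StreamUtil::read_int(in);
    std::unique_ptr<Reader> reader;
    if (version == 1) reader.reset(new ReaderV1);
    else if (version == 2) reader.reset(new ReaderV2);
    else throw std::runtime_error("unsupported file version");
    assert(reader);  // set on every path that did not throw
    return reader->read_obj(in);
}
```

Because `StreamUtil` only depends on `std::istream` and `std::ostream`, the same helpers can be pointed at a file stream, a string stream, or a stream wrapper around a network buffer.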

Hope that helps.

Upvotes: 3
