smörkex

Reputation: 336

How do I avoid code duplication in this example?

I have a simple library (let's call it Library #1) that manipulates a data vector (e.g. a time series). I want to make a new library (Library #2) that has most (but not all) of the same functions, but acts on only a single data point in that vector. I imagine a solution that is a thin wrapper around the existing Library #1 and minimizes code duplication.

Here is the simple Library #1:

class Foo {
private: 
   std::vector<double> data;

public:
   // Some constructors

   double get_data_at_timepoint(int timepoint) const;

   // Some other methods
};

get_data_at_timepoint just returns the appropriate element of the data vector (assuming it exists). The other class in the library, Bar, holds a container of Foo objects and manipulates them in some way. In particular, it can do_something, and you can also get a Foo from it:

class Bar {
private:
    std::vector<Foo> foos;

public:
    // Some constructors

    Foo get_this_foo(int idx) const;

    void do_something();

    // Some other methods
};

where (important) do_something calls get_data_at_timepoint in some way:

void Bar::do_something() {
   // ...
   double x = foos[some_idx].get_data_at_timepoint(some_timepoint);
   // ...
}

Library #2 should be Library #1 restricted to a single point in time. Something like:

class Foo2 {
private:
    double data;

public:
    double get_data() const;

    // All those other methods of Foo
};

class Bar2 {
private:
    std::vector<Foo2> foos;

public:
    Foo2 get_this_foo_2(int idx) const;
    void do_something();
    // All those other methods of Bar
};

where now:

void Bar2::do_something() {
   // ...
   double x = foos[some_idx].get_data();
   // ...
}

Clearly, Foo2 is basically just Foo, but with a single data entry. I could rewrite all of Foo, but then I would have to duplicate all the methods. I want instead to define a thin wrapper of Foo that is of length 1 (a single datapoint).

For Foo2, there are two options: (1) subclass Foo, or (2) have Foo2 be a wrapper around a unique ptr to Foo. I think (2) is better because the user should not have access to e.g. timepoints in base class Foo.
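Option (2) could look roughly like the sketch below. It rests on assumptions: the Foo constructor taking a vector and the Foo2 constructor taking a single double are hypothetical (the real constructors are not shown above), and Foo is held by value rather than through a unique pointer to keep the example short.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for the Foo of Library #1. The constructor is an assumption,
// since the original only says "Some constructors".
class Foo {
private:
    std::vector<double> data;

public:
    explicit Foo(std::vector<double> values) : data(std::move(values)) {}

    double get_data_at_timepoint(int timepoint) const {
        // Precondition from the question: the element is assumed to exist.
        assert(timepoint >= 0 &&
               static_cast<std::size_t>(timepoint) < data.size());
        return data[static_cast<std::size_t>(timepoint)];
    }
};

// Thin wrapper: owns a Foo of length 1 and exposes no timepoint parameter.
class Foo2 {
private:
    Foo impl;  // held by value here; a std::unique_ptr<Foo> member would also work

public:
    explicit Foo2(double value) : impl(std::vector<double>{value}) {}

    // Forwards to the wrapped Foo, always at timepoint 0.
    double get_data() const { return impl.get_data_at_timepoint(0); }

    // The other methods of Foo would be forwarded to impl in the same way.
};
```

With this, `Foo2 f2(3.5); f2.get_data();` reads the single stored value, and the user never sees a timepoint.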

I also want to avoid writing extra code for Bar. The function do_something of course needs to be adapted slightly in Bar2, but overall these two seem so parallel. A lot of the other methods in Bar are also the same.

How do I avoid code duplication for Foo2 and Bar2?

Upvotes: 0

Views: 111

Answers (2)

t.niese

Reputation: 40882

That is why, in many C++ libraries, classes have only a small number of member functions, limited to what is really needed to describe the type. Everything else is done with free functions, which lets you modularize the code and reuse functionality more easily.

So use member functions only to hide the implementation details of the class's data structure, or where virtual is needed; for everything else, you usually want free functions.

An implementation might look something like this (just proof-of-concept code to illustrate the use of free functions):

#include <iostream>
#include <vector>

class Foo {
private: 
   std::vector<double> data = {1,2,3};
public:
   std::vector<double>::const_iterator begin() const {
      return data.begin();
   }

   std::vector<double>::const_iterator end() const {
      return data.end();
   }

   double first() const {
     return *begin();
   }
};

class Foo2 {
private: 
   double data = 42;

public:
   const double * begin() const {
      return &data;
   }

   const double * end() const {
      return &data + 1;
   }

   double first() const {
     return *begin();
   }
};

template<typename T>
double get_data_at_timepoint(const T &obj, size_t index) {
    auto it = obj.begin()+index;
    return *it;
}

template<typename T>
double get_data(const T &obj) {
    return obj.first();
}


int main()
{
    Foo f;
    Foo2 f2;

    // The same free functions work for both classes.
    double d = get_data_at_timepoint(f, 2);
    double d2 = get_data_at_timepoint(f2, 0);

    double d3 = get_data(f);
    double d4 = get_data(f2);

    std::cout << "get_data_at_timepoint " << d << " " << d2 << std::endl;
    std::cout << "get_data " << d3 << " " << d4 << std::endl;
}

Now you can use get_data_at_timepoint with any type whose begin() returns a random-access iterator over double values; it is not limited to your class Foo at all.

If get_data_at_timepoint needs special handling for one of those classes, you could create a specialized version just for that class:
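As a small illustration of that point, the same template can be called on standard containers directly. The two helper functions below are hypothetical and exist only to show the calls; the bounds assert is an addition that is not in the code above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

template <typename T>
double get_data_at_timepoint(const T &obj, std::size_t index) {
    // Added precondition check: index must be inside [begin, end).
    assert(index < static_cast<std::size_t>(std::distance(obj.begin(), obj.end())));
    auto it = obj.begin() + index;
    return *it;
}

// Hypothetical helper: the template applied to a plain std::vector<double>.
inline double third_of_vector() {
    const std::vector<double> v = {1.0, 2.0, 3.0};
    return get_data_at_timepoint(v, 2);
}

// Hypothetical helper: the template applied to a std::array<double, 2>.
inline double second_of_array() {
    const std::array<double, 2> a = {4.0, 5.0};
    return get_data_at_timepoint(a, 1);
}
```

Neither container knows anything about Foo; all the template needs is begin() and end() with iterators that support `+`.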

template<>
double get_data_at_timepoint<Foo>(const Foo &obj, size_t index) {
    // Foo related implementation
}
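A filled-in version of such a specialization might look like the sketch below. It assumes a size() member on Foo, which is hypothetical and not part of the class shown earlier, and it uses bounds checking as an arbitrary example of "Foo related" behaviour.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

class Foo {
private:
    std::vector<double> data = {1, 2, 3};

public:
    std::vector<double>::const_iterator begin() const { return data.begin(); }
    std::vector<double>::const_iterator end() const { return data.end(); }
    std::size_t size() const { return data.size(); }  // hypothetical addition
};

// Generic version: no bounds checking.
template <typename T>
double get_data_at_timepoint(const T &obj, std::size_t index) {
    return *(obj.begin() + index);
}

// Specialization used only when T is Foo: throws on an invalid timepoint.
template <>
double get_data_at_timepoint<Foo>(const Foo &obj, std::size_t index) {
    if (index >= obj.size()) {
        throw std::out_of_range("timepoint out of range");
    }
    return *(obj.begin() + index);
}
```

Calls such as `get_data_at_timepoint(f, 1)` with a Foo pick the specialization automatically; every other type still goes through the generic template.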

Upvotes: 0

Anonymous1847

Reputation: 2598

Do this:

// timepoint_t stands for whatever type Bar uses for time points
// (the question uses int).
template <typename FooType>
class BarBase {
protected:
    // protected rather than private so that Bar and Bar2 can reach the foos
    std::vector<FooType> foos;
    virtual double get_data_from_foo(unsigned int, void*) = 0;
public:
    void do_something();
    // all other methods that used to be in Bar
};

class Bar : public BarBase<Foo> {
protected:
    double get_data_from_foo(unsigned int id, void* time_ptr) override {
        return foos[id].get_data_at_timepoint(*static_cast<timepoint_t*>(time_ptr));
    }
};

class Bar2 : public BarBase<Foo2> {
protected:
    // the time point is ignored here
    double get_data_from_foo(unsigned int id, void* dummy) override {
        return foos[id].get_data();
    }
};

You’ll have to call get_data_from_foo() inside BarBase::do_something(). You also have to calculate the time point and pass it to that function, regardless of whether it’s needed.
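A self-contained sketch of how that call could look is below. Several things in it are assumptions made only for illustration: timepoint_t is taken to be int, the constructors of Foo, Foo2, and BarBase are hypothetical, and do_something() takes a time point and returns a sum purely so that its effect is observable (the original returns void).

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using timepoint_t = int;  // assumption: the type is not defined in the original

// Minimal stand-ins for the question's Foo and Foo2 (constructors are assumed).
class Foo {
    std::vector<double> data;
public:
    explicit Foo(std::vector<double> values) : data(std::move(values)) {}
    double get_data_at_timepoint(timepoint_t t) const {
        return data.at(static_cast<std::size_t>(t));
    }
};

class Foo2 {
    double data;
public:
    explicit Foo2(double value) : data(value) {}
    double get_data() const { return data; }
};

template <typename FooType>
class BarBase {
protected:
    std::vector<FooType> foos;
    virtual double get_data_from_foo(unsigned int id, void *time_ptr) = 0;

public:
    explicit BarBase(std::vector<FooType> f) : foos(std::move(f)) {}
    virtual ~BarBase() = default;

    // The shared logic is written once. Summing is a hypothetical body.
    double do_something(timepoint_t timepoint) {
        double sum = 0.0;
        for (unsigned int id = 0; id < foos.size(); ++id) {
            // The time point is always passed, even if the subclass ignores it.
            sum += get_data_from_foo(id, &timepoint);
        }
        return sum;
    }
};

class Bar : public BarBase<Foo> {
public:
    using BarBase<Foo>::BarBase;
protected:
    double get_data_from_foo(unsigned int id, void *time_ptr) override {
        return foos[id].get_data_at_timepoint(*static_cast<timepoint_t *>(time_ptr));
    }
};

class Bar2 : public BarBase<Foo2> {
public:
    using BarBase<Foo2>::BarBase;
protected:
    double get_data_from_foo(unsigned int id, void * /*unused*/) override {
        return foos[id].get_data();
    }
};
```

Both Bar and Bar2 then share one do_something(); only the one-line accessor differs between them.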

Alternatively, if you don’t mind the code duplication inside do_something(), remove get_data_from_foo() and add a do_something() member function to each of Bar and Bar2, defining them separately.

Upvotes: 1
