Marc Claesen
Marc Claesen

Reputation: 17026

Elegant way to implement extensible factories in C++

I am looking for an intuitive and extensible way to implement factories for subclasses of a given base class in . I want to provide such a factory function in a library.The tricky part is that I want said factory to work for user-defined subclasses as well (e.g. having the library's factory function build different subclasses depending on what modules are linked to it). The goal is to have minimal burden/confusion for downstream developers to use the factories.

An example of what I want to do is: given a std::istream, construct and return an object of whatever subclass matches the content, or a null pointer if no matches are found. The global factory would have a signature like:

Base* Factory(std::istream &is){ ... };

I am familiar with prototype factories, but I prefer to avoid the need to make/store prototype objects. A related question is posted here for : Allowing maximal flexibly/extensibility using a factory.

I am not looking for -specific solutions at the moment, but if they are more elegant I would be happy to learn about those.

I came up with one working solution which I believe is fairly elegant, which I will post as an answer. I can imagine this problem to be fairly common, so I am wondering if anyone knows of better approaches.

EDIT: it seems some clarification is in order...

The idea is for the factory to construct an object of a derived class, without containing the logic to decide which one. To make matters worse, the factory method will end up as part of a library and derived classes may be defined in plugins.

Derived classes must be able to decide for themselves whether or not they are fit for construction, based on the input provided (for example an input file). This decision can be implemented as a predicate that can be used by the factory, as was suggested by several people (great suggestion, by the way!).

Upvotes: 25

Views: 9326

Answers (7)

Suedocode
Suedocode

Reputation: 2524

If I understand this correctly, we want a factory function that can select which derived class to instantiate based on constructor inputs. This is the most generic solution that I could come up with so far. You specify mapping inputs to organize factory functions, and then you can specify constructor inputs upon factory invocation. I hate to say that the code explains more than I could in words, however I think the example implementations of FactoryGen.h in Base.h and Derived.h are clear enough with the help of comments. I can provide more details if necessary.

FactoryGen.h

#pragma once

#include <map>
#include <tuple>
#include <typeinfo>

//C++11 typename aliasing, doesn't work in visual studio though...
/*
template<typename Base>
using FactoryGen<Base> = FactoryGen<Base,void>;
*/

//Assign unique ids to all classes within this map.  Better than typeid(class).hash_code() since there is no computation during run-time.
size_t __CLASS_UID = 0;

template<typename T>
inline size_t __GET_CLASS_UID(){
    static const size_t id = __CLASS_UID++;
    return id;
}

//These are the common code snippets from the factories and their specializations. 
template<typename Base>
struct FactoryGenCommon{
    typedef std::pair<void*,size_t> Factory; //A factory is a function pointer and its unique type identifier

    //Generates the function pointer type so that I don't have stupid looking typedefs everywhere
    template<typename... InArgs>
    struct FPInfo{ //stands for "Function Pointer Information"
        typedef Base* (*Type)(InArgs...);
    };

    //Check to see if a Factory is not null and matches it's signature (helps make sure a factory actually takes the specified inputs)
    template<typename... InArgs>
    static bool isValid(const Factory& factory){
        auto maker = factory.first;
        if(maker==nullptr) return false;

        //we have to check if the Factory will take those inArgs
        auto type = factory.second;
        auto intype = __GET_CLASS_UID<FPInfo<InArgs...>>();
        if(intype != type) return false;

        return true;
    }
};

//template inputs are the Base type for which the factory returns, and the Args... that will determine how the function pointers are indexed.
template<typename Base, typename... Args> 
struct FactoryGen : FactoryGenCommon<Base>{
    typedef std::tuple<Args...> Tuple;
    typedef std::map<Tuple,Factory> Map; //the Args... are keys to a map of function pointers

    inline static Map& get(){ 
        static Map factoryMap;
        return factoryMap; 
    }

    template<typename... InArgs>
    static void add(void* factory, const Args&... args){
        Tuple selTuple = std::make_tuple(args...); //selTuple means Selecting Tuple.  This Tuple is the key to the map that gives us a function pointer
        get()[selTuple] = Factory(factory,__GET_CLASS_UID<FPInfo<InArgs...>>());
    }

    template<typename... InArgs>
    static Base* make(const Args&... args, const InArgs&... inArgs){
        Factory factory = get()[std::make_tuple(args...)];
        if(!isValid<InArgs...>(factory)) return nullptr;
        return ((FPInfo<InArgs...>::Type)factory.first) (inArgs...);
    }
};

//Specialize for factories with no selection mapping
template<typename Base>
struct FactoryGen<Base,void> : FactoryGenCommon<Base>{
    inline static Factory& get(){
        static Factory factory;
        return factory; 
    }

    template<typename... InArgs>
    static void add(void* factory){
        get() = Factory(factory,__GET_CLASS_UID<FPInfo<InArgs...>>());
    }

    template<typename... InArgs>
    static Base* make(const InArgs&... inArgs){
        Factory factory = get();
        if(!isValid<InArgs...>(factory)) return nullptr;
        return ((FPInfo<InArgs...>::Type)factory.first) (inArgs...);
    }
};

//this calls the function "initialize()" function to register each class ONCE with the respective factory (even if a class tries to initialize multiple times)
//this step can probably be circumvented, but I'm not totally sure how
template <class T>
class RegisterInit {
  int& count(void) { static int x = 0; return x; } //counts the number of callers per derived
public:
  RegisterInit(void) { 
    if ((count())++ == 0) { //only initialize on the first caller of that class T
      T::initialize();
    }
  }
};

Base.h

#pragma once

#include <map>
#include <string>
#include <iostream>
#include "Procedure.h"
#include "FactoryGen.h"

class Base {
public:
    static Base* makeBase(){ return new Base; }
    static void initialize(){ FactoryGen<Base,void>::add(Base::makeBase); } //we want this to be the default mapping, specify that it takes void inputs

    virtual void speak(){ std::cout << "Base" << std::endl; }
};

RegisterInit<Base> __Base; //calls initialize for Base

Derived.h

#pragma once

#include "Base.h"

class Derived0 : public Base {
private:
    std::string speakStr;
public:
    Derived0(std::string sayThis){ speakStr=sayThis; }

    static Base* make(std::string sayThis){ return new Derived0(sayThis); }
    static void initialize(){ FactoryGen<Base,int>::add<std::string>(Derived0::make,0); } //we map to this subclass via int with 0, but specify that it takes a string input

    virtual void speak(){ std::cout << speakStr << std::endl; }
};

RegisterInit<Derived0> __d0init; //calls initialize() for Derived0

class Derived1 : public Base {
private:
    std::string speakStr;
public:
    Derived1(std::string sayThis){ speakStr=sayThis; }

    static Base* make(std::string sayThat){ return new Derived0(sayThat); }
    static void initialize(){ FactoryGen<Base,int>::add<std::string>(Derived0::make,1); } //we map to this subclass via int with 1, but specify that it takes a string input

    virtual void speak(){ std::cout << speakStr << std::endl; }
};

RegisterInit<Derived1> __d1init; //calls initialize() for Derived1

Main.cpp

#include <windows.h> //for Sleep()
#include "Base.h"
#include "Derived.h"

using namespace std;

int main(){
    Base* b = FactoryGen<Base,void>::make(); //no mapping, no inputs
    Base* d0 = FactoryGen<Base,int>::make<string>(0,"Derived0"); //int mapping, string input
    Base* d1 = FactoryGen<Base,int>::make<string>(1,"I am Derived1"); //int mapping, string input

    b->speak();
    d0->speak();
    d1->speak();

    cout << "Size of Base: " << sizeof(Base) << endl;
    cout << "Size of Derived0: " << sizeof(Derived0) << endl;

    Sleep(3000); //Windows & Visual Studio, sry
}

I think this is a pretty flexible/extensible factory library. While the code for it is not very intuitive, I think using it is fairly simple. Of course, my view is biased seeing as I'm the one that wrote it, so please let me know if it is the contrary.

EDIT : Cleaned up the FactoryGen.h file. This is probably my last update, however this has been a fun exercise.

Upvotes: 11

Speed8ump
Speed8ump

Reputation: 1317

First, there's not really enough detail here to form an opinion, so I'm left to guess. You've provided a challenging question and a minimal solution, but not clarified what is wrong with your solution.

I suspect the complaint centers around the reset back to knowing nothing between a refused construction and the following construction attempts. Given a very large number of potential factories this reset could have us parsing the same data hundreds or thousands of times. If this is the problem the question is this: how do you structure the predicate evaluation phase to limit the amount of work, and allow it to reuse previous parsing results.

I suggest having each factory register with: 1) a factory builder function taking the specialization parameter(s) (iostream in the example) 2) an unordered set of boolean predicates 3) required boolean values of each predicate to allow construction

The set of predicates is used to create/modify the predicate tree. Interior nodes in the tree represent predicates (branching to 'pass', 'fail', and possibly 'don't care'). Both interior nodes and leaves hold constructors which are satisfied if the ancestral predicates are satisfied. As you traverse the tree you first look for constructors at the current level, then evaluate the predicate and follow the required path. If no solution is found along that child path the follow the 'don't care' path.

This allows new factories to share predicate functions. There's probably lots of questions about managing/sorting the tree when the factories go on/off line. There's also the possibility of parser state data that needs to be retained across predicates and reset when construction is completed. There's lots of open questions, but this may work toward addressing the perceived problems with your solution.

TL:DR; Create a graph of predicates to traverse when attempting construction.

Upvotes: 1

Tom Kerr
Tom Kerr

Reputation: 10720

Here is a sustainable idiom for managing factories that resolve at runtime. I've used this in the past to support fairly sophisticated behavior. I favor simplicity and maintainability without giving up much in the way of functionality.

TLDR:

  • Avoid static initialization in general
  • Avoid "auto-loading" techniques like the plague
  • Communicate ownership of objects AND factories
  • Separate usage and factory management concerns

Using Runtime Factories

Here is the base interface that users of this factory system will interact with. They shouldn't need to worry about the details of the factory.

class BaseObject {
public:
    virtual ~BaseObject() {}
};

BaseObject* CreateObjectFromStream(std::istream& is);

As an aside, I would recommend using references, boost::optional, or shared_ptr instead of raw pointers. In a perfect world, the interface should tell me who owns this object. As a user, am I responsible for deleting this pointer when it's given to me? It's painfully clear when it's a shared_ptr.

Implementing Runtime Factories

In another header, put the details of managing the scope of when the factories are active.

class RuntimeFactory {
public:
    virtual BaseObject* create(std::istream& is) = 0;
};

void RegisterRuntimeFactory(RuntimeFactory* factory);
void UnregisterRuntimeFactory(RuntimeFactory* factory);

I think the salient point in all of this is that usage is a different concern from how the factories are initialized and used.

We should note that the callers of these free functions own the factories. The registry does not own them.

This isn't strictly necessary, though it offers more control when and where these factories get destroyed. The point where it matters is when you see things like "post-create" or "pre-destroy" calls. Factory methods with these sorts of names are design smells for ownership inversion.

Writing another wrapper around this to manage the factories life-time would be simple enough anyway. It also lends to composition, which is better.

Registering Your New Factory

Write wrappers for each factory registration. I usually put each factory registration in its own header. These headers are usually just two function calls.

void RegisterFooFactory();
void UnregisterFooFactory();

This may seem like overkill, but this sort of diligence keeps your compile times down.

My main then is reduced to a bunch of register and unregister calls.

#include <foo_register.h>
#include <bar_register.h>

int main(int argc, char* argv[]) {
    SetupLogging();
    SetupRuntimeFactory();
    RegisterFooFactory();
    RegisterBarFactory();

    // do work...

    UnregisterFooFactory();
    UnregisterBarFactory();
    CleanupLogging();
    return 0;
}

Avoid Static Init Pitfalls

This specifically avoids objects created during static loading like some of the other solutions. This is not an accident.

  • The C++ spec won't give you useful assurances about when static loading will occur
  • You'll get a stack trace when something goes wrong
  • The code is simple, direct, easy to follow

Implementing the Registry

Implementation details are fairly mundane, as you'd imagine.

class RuntimeFactoryRegistry {
public:
    void registerFactory(RuntimeFactory* factory) {
        factories.insert(factory);
    }

    void unregisterFactory(RuntimeFactory* factory) {
        factories.erase(factory);
    }

    BaseObject* create(std::istream& is) {
        std::set<RuntimeFactory*>::iterator cur = factories.begin();
        std::set<RuntimeFactory*>::iterator end = factories.end();
        for (; cur != end; cur++) {
            // reset input?
            if (BaseObject* obj = (*cur)->create(is)) {
                return obj;
            }
        }
        return 0;
    }

private:
    std::set<RuntimeFactory*> factories;
};

This assumes that all factories are mutually exclusive. Relaxing this assumption is unlikely to result in well-behaving software. I'd probably make stronger claims in person, hehe. Another alternative would be to return a list of objects.

The below implementation is static for simplicity of demonstration. This can be a problem for multi-threaded environments. It doesn't have to be static, nor do I recommend it should or shouldn't be static, it just is here. It isn't really the subject of the discussion, so I'll leave it at that.

These free functions only act as pass-through functions for this implementation. This lets you unit test the registry or reuse it if you were so inclined.

namespace {

    static RuntimeFactoryRegistry* registry = 0;

} // anon    

void SetupRuntimeFactory() {
    registry = new RuntimeFactoryRegistry;
}

void CleanupRuntimeFactory() {
    delete registry;
    registry = 0;
}

BaseObject* CreateObjectFromStream(std::istream& is) {
    return registry->create(is);
}

void RegisterRuntimeFactory(RuntimeFactory* factory) {
    registry->registerFactory(factory);
}

void UnregisterRuntimeFactory(RuntimeFactory* factory) {
    registry->unregisterFactory(factory);
}

Upvotes: 3

tp1
tp1

Reputation: 1207

Simple solution is just a switch-case:

Base *create(int type, std::string data) {
  switch(type) { 
   case 0: return new Derived1(data); 
   case 1: return new Derived2(data);
  };
}

But then it's just deciding which type you want:

   int type_of_obj(string s) {
      int type = -1;
      if (isderived1(s)) type=0;
      if (isderived2(s)) type=1;
      return type;
   } 

Then it's just connecting the two:

Base *create_obj(string s, string data, 
                 Base *(*fptr)(int type, string data), 
                 int (*fptr2)(string s)) 
{
   int type = fptr2(s);
   if (type==-1) return 0;
   return fptr(type, data);   
}

Then it's just registering the function pointers:

   class Registry {
   public:
       void push_back(Base* (*fptr)(int type, string data),
                      int (*fptr2)(string s));
       Base *create(string s, string data);
   };

The plugin will have the 2 functions, and the following:

void register_classes(Registry &reg) {
    reg.push_back(&create, &type_of_obj);
    ...
}

Plugin loader will dlopen/dlsym the register_classes functions.

(on the other hand, I'm not using this kind of plugins myself because creating new plugins is too much work. I have better way to provide modularity for my program's pieces. What kills plugins is the fact that you need to modify your build system to create new dll's or shared_libs, and doing that is just too much work - ideally new module is just one class; without anything more complicated build system modifications)

Upvotes: 0

icabod
icabod

Reputation: 7074

Update: This answer made the assumption that some kind of magic existed that could be read and passed to the factory, but that's apparently not the case. I'm leaving the answer here because a) I may update it, and b) I like it anyway.


Not hugely different from your own answer, not using C++11 techniques (I've not had a chance to update it yet, or have it return a smart pointer, etc), and not entirely my own work, but this is the factory class I use. Importantly (IMHO) it doesn't call each possible class's methods to find the one that matches - it does this via the map.

#include <map>
// extraneous code has been removed, such as empty constructors, ...
template <typename _Key, typename _Base, typename _Pred = std::less<_Key> >
class Factory {
public:
    typedef _Base* (*CreatorFunction) (void);
    typedef std::map<_Key, CreatorFunction, _Pred> _mapFactory;

    // called statically by all classes that can be created
    static _Key Register(_Key idKey, CreatorFunction classCreator) {
        get_mapFactory()->insert(std::pair<_Key, CreatorFunction>(idKey, classCreator));
        return idKey;
    }

    // Tries to create instance based on the key
    static _Base* Create(_Key idKey) {
        _mapFactory::iterator it = get_mapFactory()->find(idKey);
        if (it != get_mapFactory()->end()) {
            if (it->second) {
                return it->second();
            }
        }
        return 0;
    }

protected:
    static _mapFactory * get_mapFactory() {
        static _mapFactory m_sMapFactory;
        return &m_sMapFactory;
    }
};

To use this you just declare the base-type, and for each class you register it as a static. Note that when you register, the key is returned, so I tend to add this as a member of the class, but it's not necessary, just neat :) ...

// shape.h
// extraneous code has been removed, such as empty constructors, ...
// we also don't technically need the id() method, but it could be handy
// if at a later point you wish to query the type.
class Shape {
public:
    virtual std::string id() const = 0;
};
typedef Factory<std::string, Shape> TShapeFactory;

Now we can create a new derived class, and register it as creatable by TShapeFactory...

// cube.h
// extraneous code has been removed, such as empty constructors, ...
class Cube : public Shape {
protected:
    static const std::string _id;
public:
    static Shape* Create() {return new Cube;}
    virtual std::string id() const {return _id;};
};

// cube.cpp
const std::string Cube::_id = TShapeFactory::Register("cube", Cube::Create);

Then we can create a new item based on, in this case, a string:

Shape* a_cube = TShapeFactory::Create("cube");
Shape* a_triangle = TShapeFactory::Create("triangle");
// a_triangle is a null pointer, as we've not registered a "triangle"

The advantage of this method is that if you create a new derived, factory-generatable class, you don't need to change any other code, providing you can see the factory class and derive from the base:

// sphere.h
// extraneous code has been removed, such as empty constructors, ...
class Sphere : public Shape {
protected:
    static const std::string _id;
public:
    static Shape* Create() {return new Sphere;}
    virtual std::string id() const {return _id;};
};

// sphere.cpp
const std::string Sphere::_id = TShapeFactory::Register("sphere", Sphere::Create);

Possible improvements that I'll leave to the reader include adding things like: typedef _Base base_class to Factory, so that when you've declared your custom factory, you can make your classes derive from TShapeFactory::base_class, and so on. The Factory should probably also check if a key already exists, but again... it's left as an exercise.

Upvotes: 4

Aleph
Aleph

Reputation: 1219

My comments were probably not very clear. So here is a C++11 "solution" relying on template meta programming : (Possibly not the nicest way of doing this though)

#include <iostream>
#include <utility>


// Type list stuff: (perhaps use an existing library here)
class EmptyType {};

template<class T1, class T2 = EmptyType>
struct TypeList
{
    typedef T1 Head;
    typedef T2 Tail;
};

template<class... Etc>
struct MakeTypeList;

template <class Head>
struct MakeTypeList<Head>
{
    typedef TypeList<Head> Type;
};

template <class Head, class... Etc>
struct MakeTypeList<Head, Etc...>
{
    typedef TypeList<Head, typename MakeTypeList<Etc...>::Type > Type;
};

// Calling produce
template<class TList, class BaseType>
struct Producer;

template<class BaseType>
struct Producer<EmptyType, BaseType>
{
    template<class... Args>
    static BaseType* Produce(Args... args)
    {
        return nullptr;
    }
};

template<class Head, class Tail, class BaseType>
struct Producer<TypeList<Head, Tail>, BaseType>
{
    template<class... Args>
    static BaseType* Produce(Args... args)
    {
        BaseType* b = Head::Produce(args...);
        if(b != nullptr)
            return b;
        return Producer<Tail, BaseType>::Produce(args...);
    }
};

// Generic AbstractFactory:
template<class BaseType, class Types>
struct AbstractFactory {
    typedef Producer<Types, BaseType> ProducerType;

    template<class... Args>
    static BaseType* Produce(Args... args)
    {
        return ProducerType::Produce(args...);
    }
};

class Base {}; // Example base class you had

struct Derived0 : public Base { // Example derived class you had
    Derived0() = default;
    static Base* Produce(int value)
    {
        if(value == 0)
            return new Derived0();
        return nullptr;
    }
};

struct Derived1 : public Base { // Another example class
    Derived1() = default;
    static Base* Produce(int value)
    {
        if(value == 1)
            return new Derived1();
        return nullptr;
    }
};

int main()
{
    // This will be our abstract factory type:
    typedef AbstractFactory<Base, MakeTypeList<Derived0, Derived1>::Type> Factory;
    Base* b1 = Factory::Produce(1);
    Base* b0 = Factory::Produce(0);
    Base* b2 = Factory::Produce(2);
    // As expected b2 is nullptr
    std::cout << b0 << ", " << b1 << ", " << b2 << std::endl;
}

Advantages:

  1. No (additional) run-time overhead as you would have with the function pointers. Works for any base type, and for any number of derived types. You still end up calling the functions of course.
  2. Thanks to variadic templates this works with any number of arguments (giving an incorrect number of arguments will produce a compile-time error message).
  3. Explicit registering of the produce member functions is not required.

Disadvantages:

  1. All of your derived types must be available when you declare the Factory type. (You must know what the possible derived types are and they must be complete.)
  2. The produce member functions for the derived types must be public.
  3. Can make compilation slower. (As always the case when relying on template metaprogramming)

In the end, using the prototype design pattern might turn out better. I don't know since I haven't tried using my code.

I'd like to state some additional things (after further discussion on the chat):

  • Each factory can only return a single object. This seems strange, as the users decide whether they will take the input to create their object or not. I would for that reason suggest your factory can return a collection of objects instead.
  • Be careful not to overcomplicate things. You want a plugin system, but I don't think you really want factories. I would propose you simply make users register their classes (in their shared object), and that you simply pass the arguments to the classes' Produce (static) member functions. You store the objects if and only if they're not the nullptr.

Upvotes: 6

Marc Claesen
Marc Claesen

Reputation: 17026

The best solution I can currently think of is by using a Factory class which stores pointers to producing functions for each derived class. When a new derived class is made, a function pointer to a producing method can be stored in the factory.

Here is some code to illustrate my approach:

#include <iostream>
#include <vector>

class Base{};

// Factory class to produce Base* objects from an int (for simplicity).
// The class uses a list of registered function pointers, which attempt
// to produce a derived class based on the given int.
class Factory{
public:
    typedef Base*(*ReadFunPtr)(int);
private:
    static vector<ReadFunPtr> registeredFuns;
public:
    static void registerPtr(ReadFunPtr ptr){ registeredFuns.push_back(ptr); }
    static Base* Produce(int value){
        Base *ptr=NULL;
        for(vector<ReadFunPtr>::const_iterator I=registeredFuns.begin(),E=registeredFuns.end();I!=E;++I){
            ptr=(*I)(value);
            if(ptr!=NULL){
                return ptr;
            }
        }
        return NULL;
    }
};
// initialize vector of funptrs
std::vector<Factory::ReadFunPtr> Factory::registeredFuns=std::vector<Factory::ReadFunPtr>();

// An example Derived class, which can be produced from an int=0. 
// The producing method is static to avoid the need for prototype objects.
class Derived : public Base{
    private:
        static Base* ProduceDerivedFromInt(int value){ 
            if(value==0) return new Derived();
            return NULL;
        }
public:
    Derived(){};

    // registrar is a friend because we made the producing function private
    // this is not necessary, may be desirable (e.g. encapsulation)
    friend class DerivedRegistrar;
};

// Register Derived in the Factory so it will attempt to construct objects.
// This is done by adding the function pointer Derived::ProduceDerivedFromInt
// in the Factory's list of registered functions.
struct DerivedRegistrar{ 
    DerivedRegistrar(){ 
        Factory::registerPtr(&(Derived::ProduceDerivedFromInt));
    }
} derivedregistrar;

int main(){
    // attempt to produce a Derived object from 1: should fail
    Base* test=Factory::Produce(1);
    std::cout << test << std::endl; // outputs 0

    // attempt to produce a Derived object from 0: works
    test=Factory::Produce(0);
    std::cout << test << std::endl; // outputs an address
}

TL;DR: in this approach, downstream developers need to implement the producing function of a derived class as a static member function (or a non-member function) and register it in the factory using a simple struct.

This seems simple enough and does not require any prototype objects.

Upvotes: 3

Related Questions