winsmith

Reputation: 21622

What is a clean "pythonic" way to implement multiple constructors?

I can't find a definitive answer for this. As far as I know, you can't have multiple __init__ functions in a Python class. So how do I solve this problem?

Suppose I have a class called Cheese with the number_of_holes property. How can I have two ways of creating cheese objects...

  1. One that takes a number of holes like this: parmesan = Cheese(num_holes=15).
  2. And one that takes no arguments and just randomizes the number_of_holes property: gouda = Cheese().

I can think of only one way to do this, but this seems clunky:

class Cheese:
    def __init__(self, num_holes=0):
        if num_holes == 0:
            # Randomize number_of_holes
            ...
        else:
            self.number_of_holes = num_holes

What do you say? Is there another way?

Upvotes: 974

Views: 495067

Answers (16)

user2138149

Reputation: 17414

I don't think any of the answers here get it quite right, although some come close.

Many answers suggest something like the following:

  • Provide a "most general" __init__ function, which takes all possible arguments
  • __init__ should (in general) have some complex logic to check all the arguments for consistency, and then set member data depending on those arguments
  • Other "constructor functions" should have more specific combinations of arguments, and all of them should call __init__

I think this is the wrong design. Unfortunately, the example given by OP is too simple to fully show why this is a bad design, as in this case the "cheese" type only takes a single integer value in all cases.

In order to realize why it is bad, we need to see a more complex example.

This is from something I am working on:

Using the above paradigm, this is what we end up writing:

class ExperimentRecord():

    def __init__(self, experiment_index=None, dictionary=None):

        if experiment_index is None and dictionary is None:
            raise ExperimentalDatabaseException('constructing instance of ExperimentRecord requires either experiment_index or dictionary to be specified')
        elif experiment_index is not None and dictionary is not None:
            raise ExperimentalDatabaseException('constructing instance of ExperimentRecord requires either experiment_index or dictionary to be specified, but not both')
        elif experiment_index is None and dictionary is not None:
            self.experiment_index = dictionary['index']
            self.record_type = dictionary['record_type']
            self.data = dictionary['data']
            self.measurement_amplitude = dictionary['amplitude']
            self.measurement_mean = dictionary['mean']
            self.measurement_stddev = dictionary['stddev']
            self.measurement_log_likelihood = dictionary['log_likelihood']
        elif experiment_index is not None and dictionary is None:
            self.experiment_index = experiment_index
            self.record_type = None
            self.data = None
            self.measurement_amplitude = None
            self.measurement_mean = None
            self.measurement_stddev = None
            self.measurement_log_likelihood = None

The resulting code is, to put it bluntly (and I say this as the person who wrote this code), shockingly bad. These are the reasons why:

  • __init__ has to use complex combinatorial logic to validate the arguments
  • if the arguments form a valid combination, then it performs some extensive initialization, in the same function
  • this violates the single responsibility principle and leads to complex code which is hard to maintain, or even understand
  • it can be improved by adding two functions __init_from_dictionary and __init_from_experimental_index but this leads to extra functions being added for really no purpose other than to try and keep the __init__ function manageable
  • this is totally not how multiple constructors work in languages like Java, C++ or even Rust. Typically we expect function overloading to separate out the logic for different ways of initializing something into totally independent functions. Here, we mixed everything into a single function, which is the exact opposite of what we want to achieve

Further, in this example, the initialization is dependent only on two variables. But I could have easily added a third:

  • For example, we might want to initialize an experimental record from a string or even a filename/path or file handle
  • We can imagine that the complexity explodes as more possible methods of initialization are introduced
  • In more complex cases, each argument might not be independent. We could imagine possible cases for initialization where valid initializations are formed from a subset of possible arguments, where the subsets overlap somehow in a complex way

For example:

Some object might take arguments A, B, C, D, E. It might be valid to initialize using the following combinations:

  • A
  • B, C, D
  • D, E
  • A, E

This is an abstract example, because it is hard to think of a simple example to present. However, if you have been around a while in the field of software engineering, you will know that such examples can and do sometimes arise, regardless of whether their existence points to some shortcomings in the overall design.


With the above said, this is what I am working with right now. It probably isn't perfect: I only started working with Python in a context which required me to write "multiple constructors" yesterday.

We fix the problems by doing the following:

  • make __init__ a "null" constructor. It should do the work of a constructor which takes no arguments
  • Add constructor functions which modify the object in some way after calling the null constructor (__init__)
  • Or, if the use case lends itself to inheritance, use an inheritance pattern as others have suggested. (This may or may not be "better" depending on the context)

Something like this, maybe:

class ExperimentRecord():

    def __init__(self):
        self.experiment_index = None
        self.record_type = None
        self.data = None
        self.measurement_amplitude = None
        self.measurement_mean = None
        self.measurement_stddev = None
        self.measurement_log_likelihood = None

    @classmethod
    def from_experiment_index(cls, experiment_index):
        tmp = cls() # calls `__new__`, `__init__`, unless I misunderstand
        tmp.experiment_index = experiment_index
        return tmp

    @classmethod
    def from_dictionary(cls, dictionary):
        tmp = cls()
        tmp.experiment_index = dictionary['index']
        tmp.record_type = dictionary['record_type']
        tmp.data = dictionary['data']
        tmp.measurement_amplitude = dictionary['amplitude']
        tmp.measurement_mean = dictionary['mean']
        tmp.measurement_stddev = dictionary['stddev']
        tmp.measurement_log_likelihood = dictionary['log_likelihood']
        return tmp

With this design, we solve the following problems:

  • single responsibility principle: each constructor function is fully independent and does its own thing to initialize the object
  • each constructor function takes the arguments it requires for initialization, and nothing more. Each possible method of initialization requires its own set of arguments, and those sets of arguments are independent, and not mashed into one single function call
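
A usage sketch of this pattern, assuming a cut-down record with only two fields (the reduced field set and the sample dictionary are made up for illustration):

```python
class ExperimentRecord:
    """Cut-down, hypothetical version of the record: two fields only."""

    def __init__(self):
        # "Null" constructor: every attribute starts out empty.
        self.experiment_index = None
        self.data = None

    @classmethod
    def from_experiment_index(cls, experiment_index):
        record = cls()
        record.experiment_index = experiment_index
        return record

    @classmethod
    def from_dictionary(cls, dictionary):
        record = cls()
        record.experiment_index = dictionary['index']
        record.data = dictionary['data']
        return record


by_index = ExperimentRecord.from_experiment_index(7)
by_dict = ExperimentRecord.from_dictionary({'index': 3, 'data': [1.0, 2.5]})
```

Each call site names the kind of input it is passing, so no argument-combination checks are needed.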

Note: Since I literally just thought of this, it's possible I have overlooked something. If that is the case please leave a comment explaining the deficiencies and I will try and think of a resolution, and then update the answer. This seems to work for my particular use case but there is always a possibility I have overlooked something, particularly as I didn't have any need to investigate writing multiple Python constructors until today.

Upvotes: 1

vartec

Reputation: 134711

Actually None is much better for "magic" values:

class Cheese:
    def __init__(self, num_holes=None):
        if num_holes is None:
            ...

Now if you want complete freedom of adding more parameters:

class Cheese:
    def __init__(self, *args, **kwargs):
        # args -- tuple of anonymous arguments
        # kwargs -- dictionary of named arguments
        self.num_holes = kwargs.get('num_holes', random_holes())

To better explain the concept of *args and **kwargs (you can actually change these names):

def f(*args, **kwargs):
   print('args:', args, 'kwargs:', kwargs)

>>> f('a')
args: ('a',) kwargs: {}
>>> f(ar='a')
args: () kwargs: {'ar': 'a'}
>>> f(1,2,param=3)
args: (1, 2) kwargs: {'param': 3}

http://docs.python.org/reference/expressions.html#calls
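
One detail worth knowing about the kwargs.get('num_holes', random_holes()) line above: the default expression is evaluated on every call, even when num_holes is supplied. A minimal sketch of a lazy alternative, assuming random_holes is a helper along these lines (the answer does not define it, so its body here is a guess):

```python
import random

calls = []  # records each time the hypothetical helper runs


def random_holes():
    # Assumed helper: the answer does not define it.
    calls.append(1)
    return random.randint(0, 100)


class Cheese:
    def __init__(self, **kwargs):
        # Only fall back to the random value when the caller gave nothing.
        if 'num_holes' in kwargs:
            self.num_holes = kwargs['num_holes']
        else:
            self.num_holes = random_holes()


parmesan = Cheese(num_holes=15)   # random_holes() is not called here
gouda = Cheese()                  # random_holes() is called once
```

This only matters when the fallback is expensive or has side effects; for a cheap random number the shorter kwargs.get form is fine.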

Upvotes: 974

Stuart

Reputation: 9868

Here (drawing on this earlier answer, the pure Python version of classmethod in the docs, and as suggested by this comment) is a decorator that can be used to create multiple constructors.

from types import MethodType
from functools import wraps

class constructor:
    def __init__(self, func):

        @wraps(func)                      
        def wrapped(cls, *args, **kwargs):
            obj = cls.__new__(cls)        # Create new instance but don't init
            super(cls, obj).__init__()    # Init any classes it inherits from
            func(obj, *args, **kwargs)    # Run the constructor with obj as self
            return obj                
        
        self.wrapped = wrapped

    def __get__(self, _, cls):
        return MethodType(self.wrapped, cls)   # Bind this constructor to the class 
        
    
class Test:
    def __init__(self, data_sequence):
        """ Default constructor, initiates with data sequence """
        self.data = [item ** 2 for item in data_sequence]
        
    @constructor
    def zeros(self, size):
        """ Initiates with zeros """
        self.data = [0 for _ in range(size)]
           
a = Test([1,2,3])
b = Test.zeros(100)

This seems the cleanest way in some cases (see e.g. multiple dataframe constructors in Pandas), where providing multiple optional arguments to a single constructor would be inconvenient: for example cases where it would require too many parameters, be unreadable, be slower or use more memory than needed. However, as earlier comments have argued, in most cases it is probably more Pythonic to route through a single constructor with optional parameters, adding class methods where needed.

Upvotes: 0

Ber

Reputation: 41873

Using num_holes=None as the default is fine if you are going to have just __init__.

If you want multiple, independent "constructors", you can provide these as class methods. These are usually called factory methods. In this case you could have the default for num_holes be 0.

from random import randint

class Cheese(object):
    def __init__(self, num_holes=0):
        "defaults to a solid cheese"
        self.number_of_holes = num_holes

    @classmethod
    def random(cls):
        return cls(randint(0, 100))

    @classmethod
    def slightly_holey(cls):
        return cls(randint(0, 33))

    @classmethod
    def very_holey(cls):
        return cls(randint(66, 100))

Now create objects like this:

gouda = Cheese()
emmentaler = Cheese.random()
leerdammer = Cheese.slightly_holey()
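
Because these factories call cls(...) rather than Cheese(...), they also work on subclasses. A small sketch, where the Emmentaler subclass is a made-up name for illustration:

```python
from random import randint


class Cheese:
    def __init__(self, num_holes=0):
        self.number_of_holes = num_holes

    @classmethod
    def random(cls):
        # cls is whichever class the method was called on.
        return cls(randint(0, 100))


class Emmentaler(Cheese):  # hypothetical subclass
    pass


plain = Cheese.random()
swiss = Emmentaler.random()
```

Here swiss is an Emmentaler instance, not a plain Cheese, without the subclass redefining the factory.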

Upvotes: 1002

Tim C.

Reputation: 179

I do not see a straightforward answer with an example yet. The idea is simple:

  • use __init__ as the "basic" constructor, as Python only allows one __init__ method
  • use @classmethod to create any other constructors and call the basic constructor

Here is a new try.

from datetime import date

class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    @classmethod
    def fromBirthYear(cls, name, birthYear):
        return cls(name, date.today().year - birthYear)

Usage:

p = Person('tim', age=18)
p = Person.fromBirthYear('tim', birthYear=2004)
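
The standard library uses the same pattern. A few alternative constructors that are class methods on built-in types:

```python
from datetime import date

# dict.fromkeys builds a dict from an iterable of keys and one shared value.
flags = dict.fromkeys(['a', 'b'], False)

# int.from_bytes builds an int from raw bytes.
number = int.from_bytes(b'\x01\x00', byteorder='big')

# date.fromisoformat (Python 3.7+) builds a date from an ISO 8601 string.
day = date.fromisoformat('2004-02-29')
```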

Upvotes: 1

teichert

Reputation: 4713

Overview

For the specific cheese example, I agree with many of the other answers about using default values to signal random initialization or to use a static factory method. However, there may also be related scenarios that you had in mind where there is value in having alternative, concise ways of calling the constructor without hurting the quality of parameter names or type information.

Since Python 3.8, functools.singledispatchmethod can help accomplish this in many cases (and the more flexible multimethod library can apply in even more scenarios). (This related post describes how one could accomplish the same in Python 3.4 without a library.) I haven't seen examples in the documentation for either of these that specifically show overloading __init__ as you ask about, but it appears that the same principles for overloading any member method apply (as shown below).

"Single dispatch" (available in the standard library) requires that there be at least one positional parameter and that the type of the first argument be sufficient to distinguish among the possible overloaded options. For the specific Cheese example, this doesn't hold since you wanted random holes when no parameters were given, but multidispatch does support the very same syntax and can be used as long as each method version can be distinguished based on the number and type of all arguments together.

Example

Here is an example of how to use either method (some of the details are in order to please mypy which was my goal when I first put this together):

from functools import singledispatchmethod as overload
# or the following more flexible method after `pip install multimethod`
# from multimethod import multidispatch as overload


class MyClass:

    @overload  # type: ignore[misc]
    def __init__(self, a: int = 0, b: str = 'default'):
        self.a = a
        self.b = b

    @__init__.register
    def _from_str(self, b: str, a: int = 0):
        self.__init__(a, b)  # type: ignore[misc]

    def __repr__(self) -> str:
        return f"({self.a}, {self.b})"


print([
    MyClass(1, "test"),
    MyClass("test", 1),
    MyClass("test"),
    MyClass(1, b="test"),
    MyClass("test", a=1),
    MyClass("test"),
    MyClass(1),
    # MyClass(),  # `multidispatch` version handles these 3, too.
    # MyClass(a=1, b="test"),
    # MyClass(b="test", a=1),
])

Output:

[(1, test), (1, test), (0, test), (1, test), (1, test), (0, test), (1, default)]

Notes:

  • I wouldn't usually make the alias called overload, but it helped make the diff between using the two methods just a matter of which import you use.
  • The # type: ignore[misc] comments are not necessary to run, but I put them in there to please mypy which doesn't like decorating __init__ nor calling __init__ directly.
  • If you are new to the decorator syntax, realize that putting @overload before the definition of __init__ is just sugar for __init__ = overload(the original definition of __init__). In this case, overload is a class so the resulting __init__ is an object that has a __call__ method so that it looks like a function but that also has a .register method which is being called later to add another overloaded version of __init__. This is a bit messy, but it pleases mypy because there are no method names being defined twice. If you don't care about mypy and are planning to use the external library anyway, multimethod also has simpler alternative ways of specifying overloaded versions.
  • Defining __repr__ is simply there to make the printed output meaningful (you don't need it in general).
  • Notice that multidispatch is able to handle three additional input combinations that don't have any positional parameters.

Upvotes: 16

Elmex80s

Reputation: 3504

This is how I solved it for a YearQuarter class I had to create. I created an __init__ which is very tolerant of a wide variety of input.

You use it like this:

>>> from datetime import date
>>> temp1 = YearQuarter(year=2017, month=12)
>>> print(temp1)
2017-Q4
>>> temp2 = YearQuarter(temp1)
>>> print(temp2)
2017-Q4
>>> temp3 = YearQuarter((2017, 6))
>>> print(temp3)
2017-Q2
>>> temp4 = YearQuarter(date(2017, 1, 18))
>>> print(temp4)
2017-Q1
>>> temp5 = YearQuarter(year=2017, quarter=3)
>>> print(temp5)
2017-Q3

And this is what the __init__ and the rest of the class look like:

import datetime


class YearQuarter:

    def __init__(self, *args, **kwargs):
        if len(args) == 1:
            [x]     = args

            if isinstance(x, datetime.date):
                self._year      = int(x.year)
                self._quarter   = (int(x.month) + 2) // 3
            elif isinstance(x, tuple):
                year, month     = x

                self._year      = int(year)

                month           = int(month)

                if 1 <= month <= 12:
                    self._quarter   = (month + 2) // 3
                else:
                    raise ValueError

            elif isinstance(x, YearQuarter):
                self._year      = x._year
                self._quarter   = x._quarter

        elif len(args) == 2:
            year, month     = args

            self._year      = int(year)

            month           = int(month)

            if 1 <= month <= 12:
                self._quarter   = (month + 2) // 3
            else:
                raise ValueError

        elif kwargs:

            self._year      = int(kwargs["year"])

            if "quarter" in kwargs:
                quarter     = int(kwargs["quarter"])

                if 1 <= quarter <= 4:
                    self._quarter     = quarter
                else:
                    raise ValueError
            elif "month" in kwargs:
                month   = int(kwargs["month"])

                if 1 <= month <= 12:
                    self._quarter     = (month + 2) // 3
                else:
                    raise ValueError

    def __str__(self):
        return '{0}-Q{1}'.format(self._year, self._quarter)
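
The month-to-quarter step relies on integer division. In Python 3 that is spelled //, since / returns a float there. A quick check of the mapping for all twelve months:

```python
# Map each month 1..12 to its quarter using floor division.
quarters = {month: (month + 2) // 3 for month in range(1, 13)}

for month, quarter in quarters.items():
    print(month, '-> Q%d' % quarter)
```

Months 1 to 3 land in quarter 1, months 4 to 6 in quarter 2, and so on up to quarter 4.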

Upvotes: 3

Andrzej Pronobis

Reputation: 36146

One should definitely prefer the solutions already posted, but since no one mentioned this solution yet, I think it is worth mentioning for completeness.

The @classmethod approach can be modified to provide an alternative constructor which does not invoke the default constructor (__init__). Instead, an instance is created using __new__.

This could be used if the type of initialization cannot be selected based on the type of the constructor argument, and the constructors do not share code.

Example:

class MyClass(set):

    def __init__(self, filename):
        self._value = load_from_file(filename)

    @classmethod
    def from_somewhere(cls, somename):
        obj = cls.__new__(cls)  # Does not call __init__
        super(MyClass, obj).__init__()  # Don't forget to call any polymorphic base class initializers
        obj._value = load_from_somewhere(somename)
        return obj
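
A self-contained sketch of the same idea, showing that __init__ really is skipped. The class and attribute names here are invented for illustration:

```python
class Config:
    def __init__(self, path):
        # Default constructor: pretend the value comes from a file path.
        self.source = 'file'
        self.value = path

    @classmethod
    def from_value(cls, value):
        obj = cls.__new__(cls)   # allocates the instance, skips __init__
        obj.source = 'direct'
        obj.value = value
        return obj


a = Config('settings.ini')
b = Config.from_value(42)
```

The catch is that every attribute __init__ would normally set has to be set by hand in the alternative constructor.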

Upvotes: 83

Alexey

Reputation: 4071

Since my initial answer was criticised on the basis that my special-purpose constructors did not call the (unique) default constructor, I post here a modified version that honours the wishes that all constructors shall call the default one:

class Cheese:
    def __init__(self, *args, _initialiser="_default_init", **kwargs):
        """A multi-initialiser.
        """
        getattr(self, _initialiser)(*args, **kwargs)

    def _default_init(self, ...):
        """A user-friendly smart or general-purpose initialiser.
        """
        ...

    def _init_parmesan(self, ...):
        """A special initialiser for Parmesan cheese.
        """
        ...

    def _init_gouda(self, ...):
        """A special initialiser for Gouda cheese.
        """
        ...

    @classmethod
    def make_parmesan(cls, *args, **kwargs):
        return cls(*args, **kwargs, _initialiser="_init_parmesan")

    @classmethod
    def make_gouda(cls, *args, **kwargs):
        return cls(*args, **kwargs, _initialiser="_init_gouda")

Upvotes: 3

Alexey

Reputation: 4071

class Cheese:
    def __init__(self, *args, **kwargs):
        """A user-friendly initialiser for the general-purpose constructor.
        """
        ...

    def _init_parmesan(self, *args, **kwargs):
        """A special initialiser for Parmesan cheese.
        """
        ...

    def _init_gouda(self, *args, **kwargs):
        """A special initialiser for Gouda cheese.
        """
        ...

    @classmethod
    def make_parmesan(cls, *args, **kwargs):
        new = cls.__new__(cls)
        new._init_parmesan(*args, **kwargs)
        return new

    @classmethod
    def make_gouda(cls, *args, **kwargs):
        new = cls.__new__(cls)
        new._init_gouda(*args, **kwargs)
        return new

Upvotes: 1

Michel Samia

Reputation: 4487

I'd use inheritance, especially if there are going to be more differences than the number of holes, and especially if Gouda will need to have a different set of members than Parmesan.

class Gouda(Cheese):
    def __init__(self):
        # super() needs the instance; super(Gouda) alone is an unbound proxy
        super().__init__(num_holes=10)


class Parmesan(Cheese):
    def __init__(self):
        super().__init__(num_holes=15)

Upvotes: 4

Brad C

Reputation: 720

Those are good ideas for your implementation, but if you are presenting a cheese-making interface to a user, they don't care how many holes the cheese has or what internals go into making cheese. The user of your code just wants "gouda" or "parmesan", right?

So why not do this:

# cheese_user.py
from cheeses import make_gouda, make_parmesan

gouda = make_gouda()
parmesan = make_parmesan()

And then you can use any of the methods above to actually implement the functions:

# cheeses.py
class Cheese(object):
    def __init__(self, *args, **kwargs):
        #args -- tuple of anonymous arguments
        #kwargs -- dictionary of named arguments
        self.num_holes = kwargs.get('num_holes',random_holes())

def make_gouda():
    return Cheese()

def make_parmesan():
    return Cheese(num_holes=15)

This is a good encapsulation technique, and I think it is more Pythonic. To me this way of doing things fits more in line with duck typing. You are simply asking for a gouda object and you don't really care what class it is.

Upvotes: 19

mluebke

Reputation: 8819

The best answer is the one above about default arguments, but I had fun writing this, and it certainly does fit the bill for "multiple constructors". Use at your own risk.

What about the __new__ method?

"Typical implementations create a new instance of the class by invoking the superclass’s __new__() method using super(currentclass, cls).__new__(cls[, ...]) with appropriate arguments and then modifying the newly-created instance as necessary before returning it."

So you can have the __new__ method modify your class definition by attaching the appropriate constructor method.

class Cheese(object):
    def __new__(cls, *args, **kwargs):

        obj = super(Cheese, cls).__new__(cls)
        num_holes = kwargs.get('num_holes', random_holes())

        if num_holes == 0:
            cls.__init__ = cls.foomethod
        else:
            cls.__init__ = cls.barmethod

        return obj

    def foomethod(self, *args, **kwargs):
        print("foomethod called as __init__ for Cheese")

    def barmethod(self, *args, **kwargs):
        print("barmethod called as __init__ for Cheese")

if __name__ == "__main__":
    parm = Cheese(num_holes=5)
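
Note that assigning to cls.__init__ changes the class itself, so the swap is shared by every instance and by any code that inspects Cheese.__init__ afterwards. A small sketch of that side effect, with hypothetical initialiser names:

```python
class Cheese:
    def __new__(cls, *args, **kwargs):
        obj = super().__new__(cls)
        # Rebinding __init__ on the class affects the class globally.
        if kwargs.get('num_holes', 0) == 0:
            cls.__init__ = cls._init_solid
        else:
            cls.__init__ = cls._init_holey
        return obj

    def _init_solid(self, *args, **kwargs):
        self.kind = 'solid'

    def _init_holey(self, *args, **kwargs):
        self.kind = 'holey'


a = Cheese(num_holes=0)
init_after_a = Cheese.__init__   # snapshot of the class attribute
b = Cheese(num_holes=5)
init_after_b = Cheese.__init__   # the class attribute has been replaced
```

Each construction leaves the class in a different state, which is one reason to treat this as a curiosity rather than a pattern for shared or threaded code.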

Upvotes: 9

Yes - that Jake.

Reputation: 17129

All of these answers are excellent if you want to use optional parameters, but another Pythonic possibility is to use a classmethod to generate a factory-style pseudo-constructor:

def __init__(self, num_holes):
    # do stuff with the number
    ...

@classmethod
def fromRandom(cls):
    return cls(...)  # pass some random number here

Upvotes: 31

Ferdinand Beyer

Reputation: 67217

Why do you think your solution is "clunky"? Personally I would prefer one constructor with default values over multiple overloaded constructors in situations like yours (Python does not support method overloading anyway):

def __init__(self, num_holes=None):
    if num_holes is None:
        ...  # Construct a gouda
    else:
        ...  # custom cheese
    # common initialization

For really complex cases with lots of different constructors, it might be cleaner to use different factory functions instead:

@classmethod
def create_gouda(cls):
    c = cls()
    # ...
    return c

@classmethod
def create_cheddar(cls):
    ...

In your cheese example you might want to use a Gouda subclass of Cheese though...

Upvotes: 21

Devin Jeanpierre

Reputation: 95616

Use num_holes=None as a default, instead. Then check for whether num_holes is None, and if so, randomize. That's what I generally see, anyway.

More radically different construction methods may warrant a classmethod that returns an instance of cls.
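
A minimal sketch of the None-default approach. The 0-100 range for the random count is an arbitrary choice for illustration:

```python
import random


class Cheese:
    def __init__(self, num_holes=None):
        if num_holes is None:
            # No value given: pick one at random (the range is arbitrary).
            num_holes = random.randint(0, 100)
        self.number_of_holes = num_holes


parmesan = Cheese(num_holes=15)
solid = Cheese(num_holes=0)   # 0 is now an ordinary value, not a magic one
gouda = Cheese()
```

Unlike a default of 0, this still lets a caller ask for a cheese with no holes at all.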

Upvotes: 10
