primelens
primelens

Reputation: 457

The most 'Pythonic' way to handle overloading

Disclaimer: this is perhaps a quite subjective question with no 'right' answer but I'd appreciate any feedback on best-practices and program design. So here goes:

I am writing a library where text files are read into Text objects. Now these might be initialized with a list of file-names or directly with a list of Sentence objects. I am wondering what the best / most Pythonic way to do this might be because, if I understand correctly, Python doesn't directly support method overloading.

One example I found in Scikit-Learn's feature extraction module simply passes the type of the input as an argument while initializing the object. I assume that once this parameter is set it's just a matter of handling the different cases internally:

if input == 'filename':
    # glob and read files
elif input == 'content':
    # do something else

While this is easy to implement, it doesn't look like a very elegant solution. So I am wondering if there is a better way to handle multiple types of inputs to initialize a class that I am overlooking.

Upvotes: 4

Views: 172

Answers (2)

Bakuriu
Bakuriu

Reputation: 102039

You can use duck typing. First you consider as if the arguments are of the type X, if they raise an exception, then you assume they are of type Y, etc:

class Text(object):
    def __init__(self, *init_vals):
        try:
            fileobjs = [open(fname) for fname in init_vals]
        except TypeError:
            # Then we consider them as file objects.
            fileobjs = init_vals

        try:
            senteces = [parse_sentences(fobj) for fobj in fileobjs]
        except TypeError:
            # Then init_vals are Sentence objects.
            senteces = fileobjs

Note that the absence of type checking means that the method actually accepts any type that implement one of the interfaces you actually use (e.g. file-like object, Sentence-like object etc.).

This method becomes quite heavy if you want to support a lot of different types, but I'd consider that bad code design. Accepting more than 2,3,4 types as initializers will probably confuse any programmer that uses your class, since he will always have to think "wait, did X also accept Y, or was it Z that accepted Y...".

It's probably better design the constructor to only accept 2,3 different interfaces and provide the user with some function/class that allows him to convert some often used types to these interfaces.

Upvotes: 2

BrenBarn
BrenBarn

Reputation: 251618

One way is to just create classmethods with different names for the different ways of instantiating the object:

class Text(object):
    def __init__(self, data):
        # handle data in whatever "basic" form you need

    @classmethod
    def fromFiles(cls, files):
        # process list of filenames into the form that `__init__` needs
        return cls(processed_data)

    @classmethod
    def fromSentences(cls, sentences):
        # process list of Sentence objects into the form that `__init__` needs
        return cls(processed_data)

This way you just create one "real" or "canonical" initialization method that accepts whatever "lowest common denominator" format you want. The specialized fromXXX methods can preprocess different types of input to convert them into the form they need to be in to pass to that canonical instantiation. The idea is that you call Text.fromFiles(...) to make a Text from filenames, or Text.fromSentences(...) to make a Text from sentence objects.

It can also be acceptable to do some simple type-checking if you just want to accept one of a few enumerable kinds of input. For instance, it's not uncommon for a class to accept either a filename (as a string) or a file object. In that case you'd do:

def __init__(self, file):
    if isinstance(file, basestring):
        # If a string filename was passed in, open the file before proceeding
        file = open(file)
    # Now you can handle file as a file object

This becomes unwieldy if you have many different types of input to handle, but if it's something relatively contained like this (e.g., an object or the string "name" that can be used to get that object), it can be simpler than the first method I showed.

Upvotes: 4

Related Questions