Reputation: 457
Disclaimer: this is perhaps quite a subjective question with no 'right' answer, but I'd appreciate any feedback on best practices and program design. So here goes:
I am writing a library where text files are read into Text objects. Now these might be initialized with a list of file-names or directly with a list of Sentence objects. I am wondering what the best / most Pythonic way to do this might be because, if I understand correctly, Python doesn't directly support method overloading.
One example I found in Scikit-Learn's feature extraction module simply passes the type of the input as an argument while initializing the object. I assume that once this parameter is set it's just a matter of handling the different cases internally:
if input == 'filename':
    # glob and read files
    pass
elif input == 'content':
    # do something else
    pass
While this is easy to implement, it doesn't look like a very elegant solution. So I am wondering if there is a better way to handle multiple types of inputs to initialize a class that I am overlooking.
Upvotes: 4
Views: 172
Reputation: 102039
You can use duck typing. First treat the arguments as if they were of type X; if that raises an exception, assume they are of type Y, and so on:
class Text(object):
    def __init__(self, *init_vals):
        try:
            # First assume the arguments are filenames.
            fileobjs = [open(fname) for fname in init_vals]
        except TypeError:
            # Then we consider them as file objects.
            fileobjs = init_vals
        try:
            sentences = [parse_sentences(fobj) for fobj in fileobjs]
        except TypeError:
            # Then init_vals are Sentence objects.
            sentences = fileobjs
        self.sentences = sentences
Note that the absence of type checking means that the method actually accepts any type that implements one of the interfaces you actually use (e.g. file-like object, Sentence-like object etc.).
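To make the try/except chain concrete, here is a minimal runnable sketch. Sentence and parse_sentences are hypothetical stand-ins (the question does not show them), and the sketch assumes parse_sentences raises TypeError when given something that is not iterable line by line.

```python
import io


class Sentence:
    """Hypothetical stand-in for the library's Sentence class."""

    def __init__(self, words):
        self.words = list(words)


def parse_sentences(fobj):
    # Assumption: one sentence per non-empty line. Iterating an object that
    # is not iterable raises TypeError, which Text.__init__ relies on.
    return [Sentence(line.split()) for line in fobj if line.strip()]


class Text:
    def __init__(self, *init_vals):
        try:
            # First assume filenames. open() raises TypeError for anything
            # that is not a path. Files opened here are not closed in this sketch.
            fileobjs = [open(fname) for fname in init_vals]
        except TypeError:
            # Not filenames: treat them as file-like objects.
            fileobjs = init_vals
        try:
            self.sentences = [s for fobj in fileobjs for s in parse_sentences(fobj)]
        except TypeError:
            # Not file-like either: assume they are already Sentence objects.
            self.sentences = list(fileobjs)


from_file_like = Text(io.StringIO("a b\nc d\n"))
from_sentences = Text(Sentence(["x", "y"]))
```

Any object that can be iterated line by line works in place of io.StringIO, which is the point of skipping explicit type checks.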
This method becomes quite heavy if you want to support a lot of different types, but I'd consider that bad code design. Accepting more than two to four types as initializers will probably confuse any programmer who uses your class, since they will always have to think "wait, did X also accept Y, or was it Z that accepted Y...".
It's probably better to design the constructor to accept only two or three different interfaces and provide the user with some function/class that allows them to convert commonly used types to these interfaces.
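As a rough illustration of that last suggestion, the constructor below accepts a single interface (an iterable of Sentence objects) and separate helper functions do the conversion. The names sentences_from_lines and sentences_from_files are hypothetical, as is the one-sentence-per-line format.

```python
class Sentence:
    """Hypothetical stand-in for the library's Sentence class."""

    def __init__(self, words):
        self.words = list(words)


class Text:
    def __init__(self, sentences):
        # The only accepted interface: an iterable of Sentence objects.
        self.sentences = list(sentences)


def sentences_from_lines(lines):
    # Hypothetical converter: any iterable of strings (including an open
    # file) becomes a list of Sentence objects, one per non-empty line.
    return [Sentence(line.split()) for line in lines if line.strip()]


def sentences_from_files(filenames):
    # Hypothetical converter: read each named file and collect its sentences.
    sentences = []
    for name in filenames:
        with open(name) as fobj:
            sentences.extend(sentences_from_lines(fobj))
    return sentences


text = Text(sentences_from_lines(["hello world", "", "second line here"]))
```

A caller holding filenames would write Text(sentences_from_files(names)), so the constructor itself never has to guess what it was given.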
Upvotes: 2
Reputation: 251618
One way is to just create classmethods with different names for the different ways of instantiating the object:
class Text(object):
    def __init__(self, data):
        # handle data in whatever "basic" form you need
        self.data = data

    @classmethod
    def fromFiles(cls, files):
        # process list of filenames into the form that `__init__` needs
        return cls(processed_data)

    @classmethod
    def fromSentences(cls, sentences):
        # process list of Sentence objects into the form that `__init__` needs
        return cls(processed_data)
This way you just create one "real" or "canonical" initialization method that accepts whatever "lowest common denominator" format you want. The specialized fromXXX methods can preprocess different types of input to convert them into the form they need to be in to pass to that canonical instantiation. The idea is that you call Text.fromFiles(...) to make a Text from filenames, or Text.fromSentences(...) to make a Text from sentence objects.
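A filled-in version of this pattern might look like the sketch below. It assumes the canonical form is a list of Sentence objects and that files hold one sentence per line; Sentence and AnnotatedText are hypothetical names used only for illustration.

```python
class Sentence:
    """Hypothetical stand-in for the library's Sentence class."""

    def __init__(self, words):
        self.words = list(words)


class Text:
    def __init__(self, sentences):
        # Canonical initializer: always receives Sentence objects.
        self.sentences = list(sentences)

    @classmethod
    def fromFiles(cls, files):
        # Turn a list of filenames into the Sentence list __init__ expects.
        sentences = []
        for name in files:
            with open(name) as fobj:
                sentences.extend(
                    Sentence(line.split()) for line in fobj if line.strip()
                )
        return cls(sentences)

    @classmethod
    def fromSentences(cls, sentences):
        # Already in canonical form, so just forward it.
        return cls(sentences)


class AnnotatedText(Text):
    """Hypothetical subclass, used to show what `cls` buys you."""


annotated = AnnotatedText.fromSentences([Sentence(["a", "b"])])
```

Because the alternative constructors call cls(...) rather than Text(...), a subclass such as AnnotatedText gets instances of its own type from the same methods.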
It can also be acceptable to do some simple type-checking if you just want to accept one of a few enumerable kinds of input. For instance, it's not uncommon for a class to accept either a filename (as a string) or a file object. In that case you'd do:
def __init__(self, file):
    # On Python 3 check against str; on Python 2 use basestring instead
    if isinstance(file, str):
        # If a string filename was passed in, open the file before proceeding
        file = open(file)
    # Now you can handle file as a file object
This becomes unwieldy if you have many different types of input to handle, but if it's something relatively contained like this (e.g., an object or the string "name" that can be used to get that object), it can be simpler than the first method I showed.
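For completeness, here is a small runnable sketch of the filename-or-file-object check, assuming Python 3 (where the string check is isinstance(file, str)). Storing the stripped lines on self.lines is an illustrative choice, not something taken from the question.

```python
import io


class Text:
    def __init__(self, file):
        # Remember whether we opened the file, so we only close what we own.
        opened_here = isinstance(file, str)
        if opened_here:
            # A string filename was passed in: open it before proceeding.
            file = open(file)
        try:
            # From here on, file is handled as a file object.
            self.lines = [line.rstrip("\n") for line in file]
        finally:
            if opened_here:
                file.close()


text = Text(io.StringIO("first line\nsecond line\n"))
```

A file object passed in by the caller is left open, since the caller may still want to use it.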
Upvotes: 4