Reputation: 52243
In many cases, there are two implementation choices: a closure and a callable class. For example,
class F:
    def __init__(self, op):
        self.op = op
    def __call__(self, arg1, arg2):
        if self.op == 'mult':
            return arg1 * arg2
        if self.op == 'add':
            return arg1 + arg2
        raise InvalidOp(self.op)

f = F('add')
or
def F(op):
    if op == 'mult':
        def f_(arg1, arg2):
            return arg1 * arg2
        return f_
    if op == 'add':
        def g_(arg1, arg2):
            return arg1 + arg2
        return g_
    raise InvalidOp(op)

f = F('add')
What factors should one consider in making the choice, in either direction?
I can think of two:
1. It seems a closure would always have better performance (can't think of a counterexample).
2. I think there are cases when a closure cannot do the job (e.g., if its state changes over time).
Am I correct in these? What else could be added?
Upvotes: 19
Views: 7511
Reputation: 663
Mr. Hettinger's answer still holds true ten years later, in Python 3.10. For anyone wondering:
from timeit import timeit

class A:  # Naive class
    def __init__(self, op):
        if op == "mut":
            self.exc = lambda x, y: x * y
        elif op == "add":
            self.exc = lambda x, y: x + y
    def __call__(self, x, y):
        return self.exc(x, y)

class B:  # More optimized class
    __slots__ = ('__call__',)
    def __init__(self, op):
        if op == "mut":
            self.__call__ = lambda x, y: x * y
        elif op == "add":
            self.__call__ = lambda x, y: x + y

def C(op):  # Closure
    if op == "mut":
        def _f(x, y):
            return x * y
    elif op == "add":
        def _f(x, y):
            return x + y
    return _f

a = A("mut")
b = B("mut")
c = C("mut")
print(timeit("[a(x,y) for x in range(100) for y in range(100)]", globals=globals(), number=10000))
# 26.47s naive class
print(timeit("[b(x,y) for x in range(100) for y in range(100)]", globals=globals(), number=10000))
# 18.00s optimized class
print(timeit("[c(x,y) for x in range(100) for y in range(100)]", globals=globals(), number=10000))
# 12.12s closure
Using a closure seems to offer significant speed gains when the number of calls is high. However, classes allow far more customization and can be the superior choice at times.
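As a small illustrative sketch (the Op class below is made up for this answer, not part of the benchmark above), a callable class can expose state and extra methods that a bare closure cannot easily offer:

class Op:
    """Callable operation with behaviour beyond __call__."""
    def __init__(self, op):
        self.op = op

    def __call__(self, x, y):
        # Same two operations as in the benchmark above.
        return x * y if self.op == "mut" else x + y

    def __repr__(self):
        return f"Op({self.op!r})"

op = Op("mut")
print(op(3, 4))    # 12
op.op = "add"      # state can be inspected and changed after creation
print(op(3, 4))    # 7
print(op)          # Op('add')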
Upvotes: 2
Reputation: 110261
I consider the class approach easier to understand at a glance, and therefore more maintainable. As this is one of the premises of good Python code, I think that, all things being equal, one is better off using a class rather than a nested function. This is one of the cases where the flexible nature of Python makes the language violate the "there should be one, and preferably only one, obvious way of doing something" guideline.
The performance difference for either side should be negligible - and if you have code where performance matters at this level, you certainly should profile it and optimize the relevant parts, possibly rewriting some of your code as native code.
But yes, if there were a tight loop using the state variables, accessing the closure variables should be slightly faster than accessing the class attributes. Of course, this can be overcome by simply inserting a line like op = self.op inside the class method, before entering the loop, so that the accesses inside the loop are made to a local variable; this avoids an attribute look-up and fetch on each access. Again, the performance differences should be negligible, and you have a more serious problem if you need this little bit of extra performance and are coding in Python.
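For illustration, a minimal sketch of that hoisting trick (the Multiplier class and its names are made up for this example, not taken from the question):

class Multiplier:
    def __init__(self, factor):
        self.factor = factor

    def scale_all(self, values):
        # Hoist the attribute into a local name before the tight loop;
        # each iteration then does a cheap local lookup instead of an
        # attribute lookup on self.
        factor = self.factor
        return [factor * v for v in values]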
Upvotes: 4
Reputation: 3433
Please note that because of an error previously found in my testing code, my original answer was incorrect. The revised version follows.
I made a small program to measure running time and memory consumption. I created the following callable class and a closure:
class CallMe:
    def __init__(self, context):
        self.context = context

    def __call__(self, *args, **kwargs):
        return self.context(*args, **kwargs)

def call_me(func):
    return lambda *args, **kwargs: func(*args, **kwargs)
I timed calls to simple functions accepting different numbers of arguments (math.sqrt() with 1 argument, math.pow() with 2 and max() with 12).
I used CPython 2.7.10 and 3.4.3+ on Linux x64. I was only able to do memory profiling on Python 2. The source code I used is available here.
My conclusions are: the overhead of an indirect call made via a closure is comparable to a call to math.pow(), while via a callable class it is roughly double that. These are very rough estimates, and they may vary with hardware, operating system and the function you're comparing against. However, they give you an idea of the impact of using each kind of callable.
Therefore, contrary to what I had written before, this supports the accepted answer given by @RaymondHettinger: closures should be preferred for indirect calls, at least as long as doing so doesn't hurt readability. Also, thanks to @AXO for pointing out the mistake in my original code.
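For anyone who wants a quick reproduction of the timing part, a minimal sketch (this is not my original benchmark; it simply reuses the CallMe and call_me definitions above):

from timeit import timeit
import math

via_class = CallMe(math.pow)     # indirect call through the callable class
via_closure = call_me(math.pow)  # indirect call through the closure

for label, fn in [("direct", math.pow),
                  ("closure", via_closure),
                  ("class", via_class)]:
    # Time one million calls through each kind of callable.
    elapsed = timeit(lambda: fn(2.0, 10.0), number=1000000)
    print(label, elapsed)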
Upvotes: 4
Reputation: 1658
I realize this is an older posting, but one factor I didn't see listed is that in Python (before nonlocal) you cannot rebind a local variable from the enclosing scope. (In your example such modification is not important, but technically speaking the inability to modify such a variable means it's not a true closure.)
For example, the following code doesn't work:
def counter():
    i = 0
    def f():
        i += 1
        return i
    return f

c = counter()
c()
The call to c above will raise an UnboundLocalError exception.
This is easy to get around by using a mutable object, such as a dictionary:
def counter():
    d = {'i': 0}
    def f():
        d['i'] += 1
        return d['i']
    return f

c = counter()
c()  # 1
c()  # 2
but of course that's just a workaround.
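For completeness, in Python 3 the nonlocal statement removes the need for this workaround; a minimal sketch:

def counter():
    i = 0
    def f():
        nonlocal i  # rebind the variable from the enclosing scope
        i += 1
        return i
    return f

c = counter()
c()  # 1
c()  # 2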
Upvotes: 4
Reputation: 226256
Closures are faster. Classes are more flexible (i.e. more methods available than just __call__).
Upvotes: 17
Reputation: 13223
I'd rewrite the class example with something like:
class F(object):
    __slots__ = ('__call__',)
    def __init__(self, op):
        if op == 'mult':
            self.__call__ = lambda a, b: a * b
        elif op == 'add':
            self.__call__ = lambda a, b: a + b
        else:
            raise InvalidOp(op)
That gives 0.40 usec/pass (the function version takes 0.31, so it is 29% slower) on my machine with Python 3.2.2. Without using object as a base class it gives 0.65 usec/pass (i.e. 55% slower than the object-based version). And for some reason, code that checks op in __call__ gives almost the same results as when the check is done in __init__: with object as a base and the check inside __call__, it gives 0.61 usec/pass.
One reason you might use classes is polymorphism.
class UserFunctions(object):
    __slots__ = ('__call__',)
    def __init__(self, name):
        f = getattr(self, '_func_' + name, None)
        if f is None: raise InvalidOp(name)
        else: self.__call__ = f

class MyOps(UserFunctions):
    @classmethod
    def _func_mult(cls, a, b): return a * b
    @classmethod
    def _func_add(cls, a, b): return a + b
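A brief usage sketch of that pattern (illustrative only; it assumes InvalidOp is defined elsewhere, as in the question):

mult = MyOps('mult')
add = MyOps('add')
print(mult(3, 4))  # 12
print(add(3, 4))   # 7

# A further subclass can contribute its own operations:
class MoreOps(MyOps):
    @classmethod
    def _func_sub(cls, a, b): return a - b

print(MoreOps('sub')(10, 4))  # 6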
Upvotes: -1