max

Reputation: 52243

Function closure vs. callable class

In many cases, there are two implementation choices: a closure and a callable class. For example,

class F:
  def __init__(self, op):
    self.op = op
  def __call__(self, arg1, arg2):
    if (self.op == 'mult'):
      return arg1 * arg2
    if (self.op == 'add'):
      return arg1 + arg2
    raise InvalidOp(self.op)

f = F('add')

or

def F(op):
  if op == 'or':
    def f_(arg1, arg2):
      return arg1 | arg2
    return f_
  if op == 'and':
    def g_(arg1, arg2):
      return arg1 & arg2
    return g_
  raise InvalidOp(op)

f = F('and')

What factors should one consider in making the choice, in either direction?

I can think of two:

Am I correct in these? What else could be added?

Upvotes: 19

Views: 7511

Answers (6)

vahvero

Reputation: 663

Mr. Hettinger's answer is still true ten years later in Python 3.10. For anyone wondering:

from timeit import timeit
class A: # Naive class
    def __init__(self, op):
        if op == "mut":
            self.exc = lambda x, y: x * y
        elif op == "add":
            self.exc = lambda x, y: x + y
    def __call__(self, x, y):
        return self.exc(x,y)

class B: # More optimized class
    __slots__ = ('__call__')
    def __init__(self, op):
        if op == "mut":
            self.__call__ = lambda x, y: x * y
        elif op == "add":
            self.__call__ = lambda x, y: x + y

def C(op): # Closure
    if op == "mut":
        def _f(x,y):
            return x * y
    elif op == "add":
        def _f(x, y):
            return x + y
    return _f

a = A("mut")
b = B("mut")
c = C("mut")
print(timeit("[a(x,y) for x in range(100) for y in range(100)]", globals=globals(), number=10000)) 
# 26.47s naive class
print(timeit("[b(x,y) for x in range(100) for y in range(100)]", globals=globals(), number=10000)) 
# 18.00s optimized class
print(timeit("[c(x,y) for x in range(100) for y in range(100)]", globals=globals(), number=10000)) 
# 12.12s closure

Using a closure seems to offer significant speed gains when the number of calls is high. However, classes allow far more customization and are sometimes the better choice.

Upvotes: 2

jsbueno

Reputation: 110261

I consider the class approach to be easier to understand at a glance, and therefore more maintainable. As this is one of the premises of good Python code, I think that, all things being equal, one is better off using a class rather than a nested function. This is one of the cases where the flexible nature of Python makes the language violate the "there should be one, and preferably only one, obvious way of doing something" principle for coding in Python.

The performance difference for either side should be negligible - and if you have code where performance matters at this level, you certainly should profile it and optimize the relevant parts, possibly rewriting some of your code as native code.

But yes, if there were a tight loop using the state variables, accessing the closure variables should be slightly faster than accessing the instance attributes. Of course, this can be overcome by simply inserting a line like op = self.op inside the class method, before entering the loop, so that the loop reads a local variable and avoids an attribute lookup on each access. Again, the performance difference should be negligible, and you have a more serious problem if you need this small amount of extra performance and are coding in Python.
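
As a minimal sketch of that local-variable trick: the class name Accumulate and its behaviour are hypothetical, made up only for this illustration. The one line that matters is op = self.op, which reads the attribute once before the loop.

```python
class Accumulate:
    """Hypothetical callable class, used only to illustrate the idea."""

    def __init__(self, op):
        self.op = op

    def __call__(self, values):
        # Read the attribute once; the loop below then uses a local
        # variable instead of doing an attribute lookup per iteration.
        op = self.op
        total = 0 if op == 'add' else 1
        for v in values:
            if op == 'add':
                total += v
            elif op == 'mult':
                total *= v
            else:
                raise ValueError('unknown op: %r' % (op,))
        return total


acc = Accumulate('add')
print(acc([1, 2, 3, 4]))  # 10
```

Whether this makes a measurable difference depends on the loop; profiling is the only way to know.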

Upvotes: 4

Yuval

Reputation: 3433

Please note that because of an error previously found in my testing code, my original answer was incorrect. The revised version follows.

I made a small program to measure running time and memory consumption. I created the following callable class and a closure:

class CallMe:
    def __init__(self, context):
        self.context = context

    def __call__(self, *args, **kwargs):
        return self.context(*args, **kwargs)

def call_me(func):
    return lambda *args, **kwargs: func(*args, **kwargs)

I timed calls to simple functions accepting different numbers of arguments (math.sqrt() with 1 argument, math.pow() with 2 and max() with 12).

I used CPython 2.7.10 and 3.4.3+ on Linux x64. I was only able to do memory profiling on Python 2. The source code I used is available here.
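
For readers who want to try a similar comparison, here is a rough sketch of a timing harness. It is not the script referred to above: the helper time_call and the choice of 100000 calls are assumptions made for illustration, and it only measures run time, not memory.

```python
import math
from timeit import timeit


class CallMe:
    def __init__(self, context):
        self.context = context

    def __call__(self, *args, **kwargs):
        return self.context(*args, **kwargs)


def call_me(func):
    return lambda *args, **kwargs: func(*args, **kwargs)


def time_call(wrapped, args, number=100000):
    """Hypothetical helper: total seconds for `number` calls of wrapped(*args)."""
    # The lambda adds the same small overhead to every variant being timed.
    return timeit(lambda: wrapped(*args), number=number)


for name, wrapped in [('direct', math.pow),
                      ('closure', call_me(math.pow)),
                      ('class', CallMe(math.pow))]:
    print(name, time_call(wrapped, (2.0, 3.0)))
```

The absolute numbers will differ between machines and Python versions; only the ratios between the three lines are meaningful.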

My conclusions are:

  • Closures run faster than equivalent callable classes: about 3 times faster on Python 2, but only 1.5 times faster on Python 3. The narrowing is both because closures became slower and because callable classes became faster.
  • Closures take less memory than equivalent callable classes: roughly 2/3 of the memory (only tested on Python 2).
  • While not part of the original question, it's interesting to note that the run time overhead for calls made via a closure is roughly the same as a call to math.pow(), while via a callable class it is roughly double that.

These are very rough estimates, and they may vary with hardware, operating system and the function you're comparing against. However, they give you an idea of the impact of using each kind of callable.

Therefore, contrary to what I wrote before, this supports the accepted answer given by @RaymondHettinger: closures should be preferred for indirect calls, at least as long as that doesn't impede readability. Also, thanks to @AXO for pointing out the mistake in my original code.

Upvotes: 4

Adam Donahue

Reputation: 1658

I realize this is an older posting, but one factor I didn't see listed is that in Python before nonlocal (that is, in Python 2) you cannot rebind a local variable contained in the referencing environment. (In your example such modification is not important, but technically speaking the inability to rebind such a variable means it's not a true closure.)

For example, the following code doesn't work:

def counter():
    i = 0
    def f():
        i += 1
        return i
    return f

c = counter()
c()

The call to c above will raise an UnboundLocalError exception.

This is easy to get around by using a mutable object, such as a dictionary:

def counter():
    d = {'i': 0}
    def f():
        d['i'] += 1
        return d['i']
    return f

c = counter()
c()     # 1
c()     # 2

but of course that's just a workaround.
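
On Python 3 the workaround is no longer needed, because the nonlocal statement (PEP 3104) lets the inner function rebind the enclosing variable. A minimal sketch of the same counter:

```python
def counter():
    i = 0

    def f():
        # Declare that i refers to the variable in counter(), so that
        # "i += 1" rebinds it instead of creating a new local.
        nonlocal i
        i += 1
        return i

    return f


c = counter()
print(c())  # 1
print(c())  # 2
```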

Upvotes: 4

Raymond Hettinger

Reputation: 226256

Closures are faster. Classes are more flexible (i.e. more methods available than just __call__).
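
As one illustration of that flexibility (a sketch, not part of the original answer): a callable class can also define __repr__, __eq__ and __hash__, which a plain closure cannot. The _ops table and the use of ValueError below are assumptions made for this example.

```python
class F:
    """Callable class with methods beyond __call__ (illustrative only)."""

    # Hypothetical dispatch table; the functions live in a dict, so they
    # are not turned into bound methods.
    _ops = {'add': lambda a, b: a + b, 'mult': lambda a, b: a * b}

    def __init__(self, op):
        if op not in self._ops:
            raise ValueError('unknown op: %r' % (op,))
        self.op = op

    def __call__(self, arg1, arg2):
        return self._ops[self.op](arg1, arg2)

    def __repr__(self):
        return 'F(%r)' % (self.op,)

    def __eq__(self, other):
        return isinstance(other, F) and self.op == other.op

    def __hash__(self):
        return hash(self.op)


f = F('add')
print(f(2, 3))        # 5
print(f)              # F('add')
print(f == F('add'))  # True
```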

Upvotes: 17

ony

Reputation: 13223

I'd rewrite the class example as something like:

class F(object):
    __slots__ = ('__call__')
    def __init__(self, op):
        if op == 'mult':
            self.__call__ = lambda a, b: a * b
        elif op == 'add':
            self.__call__ = lambda a, b: a + b
        else:
            raise InvalidOp(op)

That gives 0.40 usec/pass (the function version takes 0.31, so it is 29% slower) on my machine with Python 3.2.2. Without object as a base class it gives 0.65 usec/pass (i.e. 55% slower than the object-based version). For some reason, code that checks op in __call__ gives almost the same results as code that checks it in __init__. With object as a base and the check inside __call__, it gives 0.61 usec/pass.

One reason to use classes might be polymorphism.

class UserFunctions(object):
    __slots__ = ('__call__')
    def __init__(self, name):
        f = getattr(self, '_func_' + name, None)
        if f is None: raise InvalidOp(name)
        else: self.__call__ = f

class MyOps(UserFunctions):
    @classmethod
    def _func_mult(cls, a, b): return a * b
    @classmethod
    def _func_add(cls, a, b): return a + b
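
A possible usage sketch follows. InvalidOp is never defined in this thread, so the exception class below is an assumption, and BitOps is a hypothetical second subclass added to show that new operations can be supplied without touching the base class. It relies on CPython looking up __call__ through the slot descriptor on the type.

```python
class InvalidOp(Exception):
    """Assumed exception type; the thread never defines it."""


class UserFunctions(object):
    __slots__ = ('__call__',)

    def __init__(self, name):
        # Look up a method named _func_<name> on the (sub)class.
        f = getattr(self, '_func_' + name, None)
        if f is None:
            raise InvalidOp(name)
        self.__call__ = f


class BitOps(UserFunctions):
    """Hypothetical subclass that adds its own operations."""

    @classmethod
    def _func_and(cls, a, b):
        return a & b

    @classmethod
    def _func_or(cls, a, b):
        return a | b


print(BitOps('and')(6, 3))  # 2
print(BitOps('or')(6, 3))   # 7
```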

Upvotes: -1
