Ozob

Reputation: 111

Why doesn't Python have an instancemethod function?

Why doesn't Python have an instancemethod function analogous to staticmethod and classmethod?

Here is how this arose for me. Suppose I have an object which I know will be hashed frequently and whose hash is expensive to calculate. Under this assumption, it is reasonable to compute the hash value once and cache it, as in the following toy example:

class A:
    def __init__(self, x):
        self.x = x
        self._hash_cache = hash(self.x)

    def __hash__(self):
        return self._hash_cache

The __hash__ function in this class does very little, just an attribute lookup and a return. Naively, it seems it ought to be equivalent to instead write:

import operator

class B:
    def __init__(self, x):
        self.x = x
        self._hash_cache = hash(self.x)

    __hash__ = operator.attrgetter('_hash_cache')

According to the documentation, operator.attrgetter returns a callable object that fetches the given attribute from its operand. If its operand is self, then it will return self._hash_cache, which is the desired result. Unfortunately this does not work:

>>> hash(A(1))
1
>>> hash(B(1))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: attrgetter expected 1 arguments, got 0

The reason for this is as follows. If one reads the descriptor HOWTO, one finds that class dictionaries store methods as functions; functions are non-data descriptors whose __get__ method returns a bound method. But operator.attrgetter does not return a function; it returns a callable object. And in fact, it is a callable object with no __get__ method:

>>> hasattr(operator.attrgetter('_hash_cache'), '__get__')
False
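
For contrast, a plain function does implement __get__, and calling it by hand produces the same kind of bound method that attribute lookup on an instance would. A minimal sketch (the names greet and Person are made up for illustration):

```python
def greet(self):
    return self

class Person:
    pass

p = Person()

# Functions are non-data descriptors: __get__ binds the function to an instance.
bound = greet.__get__(p, Person)

print(bound.__func__ is greet)  # the bound method wraps the original function
print(bound.__self__ is p)      # and remembers the instance it was bound to
print(bound() is p)             # calling it passes p as self
```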

Lacking a __get__ method, this of course will not automatically be turned into a bound method. We can make a bound method from it using types.MethodType, but using it in our class B would require creating a bound method for every object instance. Even then, assigning it to __hash__ on the instance would not help, because hash() looks up __hash__ on the type and ignores instance attributes.
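
As a rough sketch of that per-instance approach, types.MethodType can bind the attrgetter to each object by hand. The attribute name cached_hash is made up for illustration; an ordinary (non-dunder) name is used so that the instance attribute is actually consulted.

```python
import operator
import types

class B:
    def __init__(self, x):
        self.x = x
        self._hash_cache = hash(self.x)
        # Bind the attrgetter to this particular instance by hand.
        # This has to be repeated for every object that is created.
        self.cached_hash = types.MethodType(
            operator.attrgetter('_hash_cache'), self
        )

b = B(1)
print(b.cached_hash() == hash(1))
```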

We can see the fact that operator.attrgetter has no __get__ directly if we browse the CPython source. I'm not very familiar with the CPython API, but I believe that what's going on is as follows. The definition of the attrgetter_type is in Modules/_operator.c, at line 1439 as I write this. This type sets tp_descr_get to 0. And according to the type object documentation, that means an object whose type is attrgetter_type will not have a __get__.

Of course, if we give ourselves a __get__ method, then everything works. This is the case in the first example above, where __hash__ is actually a function and not just a callable. It's also true in some other cases. For example, if we want to look up a class attribute, we could write the following:

class C:
    y = 'spam'
    get_y = classmethod(operator.attrgetter('y'))

As written this is terribly un-Pythonic (though it might be defensible if there were a strange custom __getattr__ for which we wanted to provide convenience functions). But at least it gives the desired result:

>>> C.get_y()
'spam'

I can't think of any reason why it would be bad for attrgetter_type to implement __get__. But on the other hand, even if it did, there would be other situations where we run into trouble. For example, suppose we have a class whose instances are callable:

class D:
    def __call__(self, other):
        ...

We can't use an instance of this class as a class attribute and expect instance lookups to generate bound methods. For instance,

d = D()

class E:
    apply_d = d

When E().apply_d() is called, D.__call__ will receive self (the instance d) but not other, and that generates a TypeError. This example might be a little far-fetched, but I'd be a little surprised if nobody had ever encountered something like this in practice. It could be fixed by giving D a __get__ method; but if D is from a third-party library that could be inconvenient.
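
A minimal sketch of that __get__ fix, assuming we control D's source. The return value of __call__ is invented for the demonstration so that the result can be inspected:

```python
import types

class D:
    def __call__(self, other):
        # Placeholder body, for illustration only.
        return ('called with', other)

    def __get__(self, instance, owner=None):
        # Accessed on the class: hand back the callable itself.
        if instance is None:
            return self
        # Accessed on an instance: bind it, so the instance arrives as `other`.
        return types.MethodType(self, instance)

d = D()

class E:
    apply_d = d

e = E()
print(e.apply_d() == ('called with', e))
```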

It seems that the easiest solution would be to have an instancemethod function. Then we could write __hash__ = instancemethod(operator.attrgetter('_hash_cache')) and apply_d = instancemethod(d) and they would both work as intended. Yet, as far as I know, no such function exists. Hence my question: Why is there no instancemethod function?


EDIT: Just to be clear, the functionality of instancemethod would be equivalent to:

import functools

def instancemethod(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

This could be applied as in the original question above. One could also imagine writing a class decorator that could be applied to D that would give it a __get__ method; but this code doesn't do this.
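
Applied to the examples from the question, the wrapper might be used as follows. This is only a sketch: D.__call__ is given a made-up body so that the result can be checked.

```python
import functools
import operator

def instancemethod(func):
    # Wrapping in a real function gives the callable a __get__ for free.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

class B:
    def __init__(self, x):
        self.x = x
        self._hash_cache = hash(self.x)

    __hash__ = instancemethod(operator.attrgetter('_hash_cache'))

class D:
    def __call__(self, other):
        return other  # illustrative body, not from the question

d = D()

class E:
    apply_d = instancemethod(d)

e = E()
print(hash(B(1)) == hash(1))
print(e.apply_d() is e)
```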

So I'm not talking about adding a new feature to Python. Really the question is one of language design: Why not provide it as, say, functools.instancemethod? If the answer is simply, "The use cases are so obscure that nobody's bothered," that's okay. But I would be happy to learn about other reasons, if there are any.

Upvotes: 4

Views: 237

Answers (2)

Ozob

Reputation: 111

I have a satisfying answer to my question. Python does have the internal interface necessary for an instancemethod function, but it's not exposed by default.

import ctypes
import operator

instancemethod = ctypes.pythonapi.PyInstanceMethod_New
instancemethod.argtypes = (ctypes.py_object,)
instancemethod.restype = ctypes.py_object

class A:
    def __init__(self, x):
        self.x = x
        self._hash_cache = hash(x)

    __hash__ = instancemethod(operator.attrgetter('_hash_cache'))

a = A(1)
print(hash(a))

The instancemethod function this creates works in essentially the same way as classmethod and staticmethod. These three functions return new objects of types instancemethod, classmethod, and staticmethod, respectively. We can see how they work by looking at Objects/funcobject.c (classmethod and staticmethod) and Objects/classobject.c (instancemethod). These objects all have __func__ members which store a callable object. They also have a __get__. For a staticmethod object, __get__ returns __func__ unchanged. For a classmethod object, __get__ returns a bound method object, where the binding is to the class object. And for an instancemethod object, __get__ returns a bound method object, where the binding is to the object instance. This is precisely the same behavior as __get__ for a function object and is exactly what we want.
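
The binding behavior of such an object can be approximated in pure Python with a small descriptor. This is a sketch of the idea only, not the C implementation; the class name InstanceMethod is hypothetical.

```python
import operator
import types

class InstanceMethod:
    """Hypothetical pure-Python stand-in for the C-level instancemethod type."""

    def __init__(self, func):
        self.__func__ = func

    def __get__(self, instance, owner=None):
        # Looked up on the class: return the wrapped callable unbound.
        if instance is None:
            return self.__func__
        # Looked up on an instance: bind the callable to that instance.
        return types.MethodType(self.__func__, instance)

class A:
    def __init__(self, x):
        self.x = x
        self._hash_cache = hash(x)

    __hash__ = InstanceMethod(operator.attrgetter('_hash_cache'))

a = A(1)
print(hash(a) == hash(1))
```

Unlike the ctypes approach, this does not depend on CPython internals, at the cost of an extra Python-level call on each lookup.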

The only documentation on these objects seems to be in the Python C API reference. My guess is that they're not exposed because they're so rarely needed. I think it would be nice to have PyInstanceMethod_New available as functools.instancemethod.

Upvotes: 0

Olivier Melançon

Reputation: 22294

There is no instancemethod decorator because this is the default behaviour for functions declared inside a class.

class A:
    ...

    # This is an instance method
    def __hash__(self):
        return self._hash_cache

Any callable which does not have a __get__ method can thus be wrapped into an instance method like so.

class A:
    def instance_method(*args):
        return any_callable(*args)

Thus creating an instancemethod decorator would just add another syntax for a feature which already exists. This would go against the saying that there should be one-- and preferably only one --obvious way to do it.

Side note

If it is so expensive to hash your instances, you might want to avoid calling your hash function on instantiation and delay it until the object is actually hashed.

One way to do that could be to set the attribute _hash_cache in __hash__ instead of __init__. Although, let me suggest a slightly more self-contained method which relies on caching your hash.

from weakref import finalize

class CachedHash:
    def __init__(self, x):
        self.x = x

    def __hash__(self, _cache={}):
        if id(self) not in _cache:
            finalize(self, _cache.pop, id(self))
            _cache[id(self)] = hash(self.x) # or some complex hash function
        return _cache[id(self)]

The use of finalize ensures the cache is cleared of an id when its instance is garbage collected.
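
The simpler variant mentioned earlier, setting _hash_cache inside __hash__ rather than __init__, could be sketched like this. The class name LazyHash and the None sentinel are choices made for this illustration:

```python
class LazyHash:
    def __init__(self, x):
        self.x = x
        self._hash_cache = None  # nothing computed yet

    def __hash__(self):
        # Compute the hash on first use, then reuse the stored value.
        if self._hash_cache is None:
            self._hash_cache = hash(self.x)  # or some complex hash function
        return self._hash_cache

h = LazyHash((1, 2))
print(h._hash_cache is None)      # not computed at construction time
print(hash(h) == hash((1, 2)))    # computed on first hash() call
print(h._hash_cache is not None)  # and stored for later calls
```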

Upvotes: 3
