Why are some numpy calls not implemented as methods?

Question

I have always considered Python a highly object-oriented programming language. Recently, I've been using numpy a lot, and I'm beginning to wonder why a number of things are implemented there only as functions, and not as methods of the numpy.array (or ndarray) object.

Given an array a, for example, you can do

import numpy as np

a = np.array([1, 2, 3])

np.sum(a)   # 6
a.sum()     # 6

which seems fine, but many calls do not work the same way, for example:

np.amax(a)  # 3
a.amax()    # AttributeError: 'numpy.ndarray' object has no attribute 'amax'

I find this confusing, unintuitive and I do not see any reason behind it. There might be a good one, though; maybe someone can just enlighten me.

Robert Kern · Accepted Answer

When numpy was introduced as a successor to Numeric, a lot of things that were just functions and not methods were added as methods to the ndarray type. At that time, the ability to subclass the array type was a new feature. It was thought that making some of these common functions methods would allow subclasses to do the right thing more easily. For example, having .sum() as a method is useful to allow the masked array type to ignore the masked values; you can write generic code that will work for both plain ndarrays and masked arrays without any branching.
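As a rough illustration of that last point, here is a minimal sketch of generic code that calls the .sum() method and works for both array types. The helper name total and the sample data are made up for this example; they are not part of numpy.

```python
import numpy as np
import numpy.ma as ma


def total(arr):
    # Hypothetical helper: it calls the method, so each array type
    # supplies its own behaviour and no type check is needed here.
    return arr.sum()


plain = np.array([1, 2, 3])
# Mask out the middle element so the masked array should skip it.
masked = ma.masked_array([1, 2, 3], mask=[False, True, False])

plain_total = total(plain)    # sums all three elements
masked_total = total(masked)  # sums only the unmasked elements (1 and 3)
```

The same total function handles both inputs without branching on the type, which is the convenience the method was meant to provide.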

Of course, you can't get rid of the functions. A nice feature of the functions is that they will accept any object that can be coerced into an ndarray, like a list of numbers, which would not have all of the ndarray methods. And the specific list of functions that were added as methods can't be all-inclusive; that's not good OO design, either. There's no need for all of the trig functions to be added as methods, for example.
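A short sketch of both points, assuming nothing beyond a standard numpy install (the variable names are only for illustration): the functions coerce a plain list for you, while the list itself has no such methods, and the trig functions exist only in function form.

```python
import numpy as np

values = [1, 2, 3]  # a plain Python list, not an ndarray

# The functions coerce the list to an array before operating on it.
list_total = np.sum(values)
list_largest = np.amax(values)

# The list itself has none of the ndarray methods.
list_has_sum_method = hasattr(values, "sum")

# Trig functions are available as functions only; ndarray has no .sin().
angles = np.array([0.0, np.pi / 2])
sines = np.sin(angles)
ndarray_has_sin_method = hasattr(angles, "sin")
```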

The current list of methods was mostly chosen early in numpy's development: they were the ones we thought would be useful as methods, either for subclassing or for notational convenience. With more experience under our belts, we have mostly come to the opinion that we added too many methods (seriously, there's no need for .ptp() to be a method) and that subclassing ndarray is typically a bad idea for reasons that I won't go into here.

So take the list of methods as mostly a subset of the list of functions that are available, with np.amin() and np.amax() as slight renames of the .min() and .max() methods to avoid aliasing the builtins.
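For the array from the question, the correspondence looks roughly like this (a small sketch; the variable names are only for illustration):

```python
import numpy as np

a = np.array([1, 2, 3])

# Function spelling and method spelling of the same reductions.
largest_by_function = np.amax(a)
largest_by_method = a.max()

smallest_by_function = np.amin(a)
smallest_by_method = a.min()
```

So where a.amax() raises AttributeError, a.max() is the method to reach for.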
