xnx
xnx

Reputation: 25478

numpy matrix trace behaviour

If X is a NumPy matrix object, why does np.trace(X) return a scalar (as expected) but X.trace() return a 1x1 matrix object?

>>> X = np.matrix([[1, 2], [3, 4]])
>>> np.trace(X)
5
>>> X.trace()
matrix([[5]])   # Why not just 5, which would be more useful?

I'm using NumPy 1.7.1, but don't see anything in the release notes of 1.8 to suggest anything's changed.

Upvotes: 3

Views: 2029

Answers (2)

hpaulj
hpaulj

Reputation: 231325

Because X.trace is coded that way! The matrix documentation says:

A matrix is a specialized 2-D array that retains its 2-D nature through operations.

np.trace is coded as (using ndarray.trace):

return asarray(a).trace(offset, axis1, axis2, dtype, out)

It's harder to follow how the matrix trace is evaluated. But looking at https://github.com/numpy/numpy/blob/master/numpy/matrixlib/defmatrix.py

I suspect it is equivalent to:

np.asmatrix(X.A.trace())

In that same file, sum is defined as:

return N.ndarray.sum(self, axis, dtype, out, keepdims=True)._collapse(axis)

mean, prod etc do the same. _collapse returns a scalar if axis is None. There isn't an explicit definition for a matrix trace, so it probably uses __array_finalize__. In other words, trace returns the default matrix type.

Several constructs that return the scalar are:

X.A.trace()
X.diagonal().sum()
X.trace()._collapse(None)

Upvotes: 2

M4rtini
M4rtini

Reputation: 13539

def __array_finalize__(self, obj):
    self._getitem = False
    if (isinstance(obj, matrix) and obj._getitem): return
    ndim = self.ndim
    if (ndim == 2):
        return
    if (ndim > 2):
        newshape = tuple([x for x in self.shape if x > 1])
        ndim = len(newshape)
        if ndim == 2:
            self.shape = newshape
            return
        elif (ndim > 2):
            raise ValueError("shape too large to be a matrix.")
    else:
        newshape = self.shape
    if ndim == 0:
        self.shape = (1, 1)
    elif ndim == 1:
        self.shape = (1, newshape[0])
    return

This is from the matrix definition, subclassing ndarray. The trace function is not changed so it is actually the same function getting called.

This function is getting called every time a matrix object is created. The problem is that if ndims is less than 2, it is forced to be larger.

Then here comes some educated guess work, which i think should be true, but i'm not familiar enough with numpy codebase to figure it out exactly.

  • np.trace and ndarray.trace are two different functions.
    • np.trace is defined in "core/fromnumeric.py"
    • ndarray.trace is defined in "core/src/multiarray/methods.c or calculation.c"
  • np.trace converts the object to a ndarray
  • ndarray.trace will try to keep the object as the subclassed object.
    • Unsure about this, i did not understand squat of that code tbh

both trace functions, will keep the result as an array object (subclassed or not). If it's a single value, it will return that single value, or else it returns the array object.

Since the result is kept as a matrix object, it will be forced to be two dimensional by the function above here. And because of this, it will not be returned as a single value, but as the matrix object.

This conclusion is further backed up by editing the _array_finalize__ function like this:

def __array_finalize__(self, obj):
    self._getitem = False
    if (isinstance(obj, matrix) and obj._getitem): return
    ndim = self.ndim
    if (ndim == 2):
        return
    if (ndim > 2):
        newshape = tuple([x for x in self.shape if x > 1])
        ndim = len(newshape)
        if ndim == 2:
            self.shape = newshape
            return
        elif (ndim > 2):
            raise ValueError("shape too large to be a matrix.")
    else:
        newshape = self.shape
    return
    if ndim == 0:

        self.shape = (1, 1)
    elif ndim == 1:

        self.shape = (1, newshape[0])
    return

notice the new return before the last if-else check. Now the result of X.trace() is a single value.

THIS IS NOT A FIX, revert the change if you try this yourself.
They have good reasons for doing this

np.tracedoes not have this problems since it convert's it to an array object directly.

The code for np.trace is (without the docstring):

def trace(a, offset=0, axis1=0, axis2=1, dtype=None, out=None):

    return asarray(a).trace(offset, axis1, axis2, dtype, out)

From the docstring of asarray

Array interpretation of a. No copy is performed if the input is already an ndarray. If a is a subclass of ndarray, a base class ndarray is returned.

Upvotes: 2

Related Questions