LinAlg
LinAlg

Reputation: 103

Dimensions of matrix-ndarray multiplication

I am wondering what the best way is to write code that can deal with sparse and dense matrices. The following example shows my concern:

input:

A = np.asmatrix(np.ones((2,3)))
B = sps.csr_matrix(A)
x = np.ones(3)
print(A.dot(x).shape, B.dot(x).shape)

output:

(1, 2) (2,)

The first result seems wrong from a mathematical perspective. Feeding the result to a function that expects a vector gives an error. I could cast the output into an array and squeeze it:

print(np.asarray(A.dot(x)).squeeze().shape)
(2,)

Is this the best solution?

Upvotes: 0

Views: 43

Answers (1)

hpaulj
hpaulj

Reputation: 231335

dense

In [135]: A.dot(x)
Out[135]: matrix([[3., 3.]])
In [136]: np.ones((2,3)).dot(x)
Out[136]: array([3., 3.])

np.matrix is a subclass of ndarray that must be 2d, and tries, as much as possible, to return np.matrix (2d) results. Hence Out[135] is 2d (1,2).

Out[136] on the other hand is a (2,3) dot (3,) => (2,). But it is worth keeping mind that dot has some special rules when one or other of the inputs is 1d.

Pure np.matrix dot:

In [161]: A.dot(np.matrix(x).T)   # A*np.matrix(x).T
Out[161]: 
matrix([[3.],
        [3.]])

sparse

While sparse matrix is modeled on np.matrix, it is its own class, and behaves differently in many cases. dot of sparse and dense can be tricky.

In [152]: B.shape
Out[152]: (2, 3)
In [153]: B.dot(x)
Out[153]: array([3., 3.])

Effectively the B has been turned into an array first:

In [154]: B.A.dot(x)         # B.toarray()
Out[154]: array([3., 3.])

But this is wrong, because it attempts to use np.array(B) instead of the proper B.toarray():

In [155]: np.dot(B,x)
Out[155]: 
array([<2x3 sparse matrix of type '<class 'numpy.float64'>'
    with 6 stored elements in Compressed Sparse Row format>,
       <2x3 sparse matrix of type '<class 'numpy.float64'>'
    with 6 stored elements in Compressed Sparse Row format>,
       <2x3 sparse matrix of type '<class 'numpy.float64'>'
    with 6 stored elements in Compressed Sparse Row format>], dtype=object)
In [156]: np.array(B)
Out[156]: 
array(<2x3 sparse matrix of type '<class 'numpy.float64'>'
    with 6 stored elements in Compressed Sparse Row format>, dtype=object)

If both variables are sparse, then the result is also sparse. In this case the special dot handling of 1d arrays does not apply. A (2,3) has to multiply a (3,1), producing a (2,1):

In [158]: B*sparse.csr_matrix(x).T
Out[158]: 
<2x1 sparse matrix of type '<class 'numpy.float64'>'
    with 2 stored elements in Compressed Sparse Row format>
In [159]: _.A
Out[159]: 
array([[3.],
       [3.]])

In sum, mixing matrix types can be confusing. It would be cleanest to convert everything to ndarray first. But it really depends on what kind of output you want.

Upvotes: 1

Related Questions