It is a simple question: when making predictions, is there any advantage to doing W @ X.T over X @ W, or is making the transpose just a waste of resources when predicting values?
For example:
print(W)
# array([18.56711151, 4.51542094])
print(X)
# [[1. 6.575]
# [1. 6.421]
# [1. 7.185]
# ...
# [1. 6.976]
# [1. 6.794]
# [1. 6.03 ]]
yP = W @ X.T
yP_ = X @ W
(yP == yP_).all()
# True
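For what it's worth, X.T in NumPy is a view (no data is copied), so the transpose itself costs essentially nothing. A rough timing sketch, with arbitrary array sizes chosen only for illustration:
import numpy as np
import timeit

rng = np.random.default_rng(0)
X = rng.random((100_000, 2))   # (n_samples, n_features), sizes are made up
W = rng.random(2)              # (n_features,)

# X.T is a view, so W @ X.T does not copy the data
t1 = timeit.timeit(lambda: W @ X.T, number=1000)
t2 = timeit.timeit(lambda: X @ W, number=1000)
print(f"W @ X.T: {t1:.4f}s    X @ W: {t2:.4f}s")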
In [57]: W = np.array([1,2])
In [58]: X = np.ones((4,2))
In [59]: X[:,1] = [1,2,3,4]
In [60]: W
Out[60]: array([1, 2])
In [61]: X
Out[61]:
array([[1., 1.],
[1., 2.],
[1., 3.],
[1., 4.]])
In [62]: X.T
Out[62]:
array([[1., 1., 1., 1.],
[1., 2., 3., 4.]])
In [63]: W@X.T
Out[63]: array([3., 5., 7., 9.])
In [64]: X@W
Out[64]: array([3., 5., 7., 9.])
The first @ uses a (2,) and a (2,4) shape to produce a (4,). The second uses a (4,2) and a (2,). Look at the matmul documentation for the rules on how it handles 1d arguments such as your W.
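A minimal sketch of that promotion rule, using the same arrays as above:
import numpy as np

W = np.array([1, 2])        # (2,)
X = np.ones((4, 2))
X[:, 1] = [1, 2, 3, 4]      # (4,2)

# W @ X.T: W is promoted to (1,2), (1,2) @ (2,4) gives (1,4),
# and the prepended 1 is removed, leaving (4,)
print((W @ X.T).shape)      # (4,)

# X @ W: W is promoted to (2,1), (4,2) @ (2,1) gives (4,1),
# and the appended 1 is removed, leaving (4,)
print((X @ W).shape)        # (4,)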
I find the einsum notation helps clarify which axes are combined:
In [75]: np.einsum('i,ji->j', W,X)
Out[75]: array([3., 5., 7., 9.])
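For comparison, the equivalent einsum for X @ W sums over the feature axis j for each row i (a small sketch with the same arrays):
import numpy as np

W = np.array([1, 2])
X = np.ones((4, 2))
X[:, 1] = [1, 2, 3, 4]

print(np.einsum('ij,j->i', X, W))   # [3. 5. 7. 9.]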
The effect of switching the arguments of @ might be clearer if W were 2d.
In [76]: W1   # W1 = np.array([[1], [2]]), (2,1) shape
Out[76]:
array([[1],
[2]])
In [77]: W1.T@X.T # (1,2) with (2,4) produces (1,4)
Out[77]: array([[3., 5., 7., 9.]])
In [78]: _.shape
Out[78]: (1, 4)
(4,2) with (2,1) produces (4,1)
In [79]: X@W1
Out[79]:
array([[3.],
[5.],
[7.],
[9.]])
In [80]: _.shape
Out[80]: (4, 1)
The results in [77] and [79] are transposes of each other, e.g. ((W1.T)@(X.T)).T produces the same (4,1) array as X@W1.
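A quick check of that identity with the same arrays:
import numpy as np

W1 = np.array([[1], [2]])      # (2,1)
X = np.ones((4, 2))
X[:, 1] = [1, 2, 3, 4]

a = ((W1.T) @ (X.T)).T         # (1,4) transposed back to (4,1)
b = X @ W1                     # (4,1)
print(np.array_equal(a, b))    # True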