It is a simple question: when making predictions, is there any advantage to doing W @ X.T over X @ W, or is making the transpose just a waste of resources when predicting values?
For example:
print(W)
# array([18.56711151, 4.51542094])
print(X)
# [[1. 6.575]
# [1. 6.421]
# [1. 7.185]
# ...
# [1. 6.976]
# [1. 6.794]
# [1. 6.03 ]]
yP = W @ X.T
yP_ = X @ W
(yP == yP_).all()
# True
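For what it's worth, X.T in NumPy is a view (no data is copied), so the transpose itself costs essentially nothing. A rough timing sketch, with arbitrary array sizes chosen only for illustration:
import numpy as np
import timeit

rng = np.random.default_rng(0)
X = rng.random((100_000, 2))   # (n_samples, n_features), sizes are made up
W = rng.random(2)              # (n_features,)

# X.T is a view, so W @ X.T does not copy the data
t1 = timeit.timeit(lambda: W @ X.T, number=1000)
t2 = timeit.timeit(lambda: X @ W, number=1000)
print(f"W @ X.T: {t1:.4f}s    X @ W: {t2:.4f}s")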
In [57]: W = np.array([1,2])
In [58]: X = np.ones((4,2))
In [59]: X[:,1] = [1,2,3,4]
In [60]: W
Out[60]: array([1, 2])
In [61]: X
Out[61]:
array([[1., 1.],
[1., 2.],
[1., 3.],
[1., 4.]])
In [62]: X.T
Out[62]:
array([[1., 1., 1., 1.],
[1., 2., 3., 4.]])
In [63]: W@X.T
Out[63]: array([3., 5., 7., 9.])
In [64]: X@W
Out[64]: array([3., 5., 7., 9.])
The first @ uses a (2,) and a (2,4) shape to produce a (4,). The second uses a (4,2) and a (2,). Look at the matmul documentation for the rules on how it handles 1d arguments such as your W.
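A minimal sketch of that promotion rule, using the same arrays as above:
import numpy as np

W = np.array([1, 2])        # (2,)
X = np.ones((4, 2))
X[:, 1] = [1, 2, 3, 4]      # (4,2)

# W @ X.T: W is promoted to (1,2), (1,2) @ (2,4) gives (1,4),
# and the prepended 1 is removed, leaving (4,)
print((W @ X.T).shape)      # (4,)

# X @ W: W is promoted to (2,1), (4,2) @ (2,1) gives (4,1),
# and the appended 1 is removed, leaving (4,)
print((X @ W).shape)        # (4,)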
I find the einsum notation helps clarify which axes are combined:
In [75]: np.einsum('i,ji->j', W,X)
Out[75]: array([3., 5., 7., 9.])
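For comparison, the equivalent einsum for X @ W sums over the feature axis j for each row i (a small sketch with the same arrays):
import numpy as np

W = np.array([1, 2])
X = np.ones((4, 2))
X[:, 1] = [1, 2, 3, 4]

print(np.einsum('ij,j->i', X, W))   # [3. 5. 7. 9.]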
The effect of switching the arguments of @ might be clearer if W were 2d.
In [76]: W1   # W1 = np.array([[1], [2]]), (2,1) shape
Out[76]:
array([[1],
[2]])
In [77]: W1.T@X.T # (1,2) with (2,4) produces (1,4)
Out[77]: array([[3., 5., 7., 9.]])
In [78]: _.shape
Out[78]: (1, 4)
(4,2) with (2,1) produces (4,1)
In [79]: X@W1
Out[79]:
array([[3.],
[5.],
[7.],
[9.]])
In [80]: _.shape
Out[80]: (4, 1)
The results in [77] and [79] are transposes of each other, e.g. ((W1.T)@(X.T)).T produces the same (4,1) array as X@W1.
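A quick check of that identity with the same arrays:
import numpy as np

W1 = np.array([[1], [2]])      # (2,1)
X = np.ones((4, 2))
X[:, 1] = [1, 2, 3, 4]

a = ((W1.T) @ (X.T)).T         # (1,4) transposed back to (4,1)
b = X @ W1                     # (4,1)
print(np.array_equal(a, b))    # True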