Reputation: 2299
I'm trying to understand how numpy works when you try to call the dot product of two row vectors.
I have this code:
import numpy as np
X = np.array([[1, 2, 3]])
THETA = np.array([[1, 2, 3]])
print(X.dot(THETA))
This gives me the error:
ValueError: shapes (1,3) and (1,3) not aligned: 3 (dim 1) != 1 (dim 0)
I thought you could take the dot product of two row vectors directly, to get:
x1*theta1 + x2*theta2 + x3*theta3
And this would also transfer to the dot product of two column vectors.
The weird part is, I have to take the transpose of the second matrix in order to actually use the dot product:
print(X.dot(THETA.T))
[[14]]
However, I didn't expect this to work, and I don't understand why it works when a plain row-dot-row operation doesn't. Can anyone help me understand what's going on? Is it some rule in linear algebra that I forgot from long ago?
Upvotes: 2
Views: 3674
Reputation: 24699
The alignment error occurs because you are representing each 1D vector as a 2D array of shape (1, 3). With 1D arrays, dot behaves the way you expect:
In [1]: import numpy as np
In [2]: X = np.array([1,2,3])
In [3]: THETA = np.array([1,2,3])
In [4]: print(X.dot(THETA))
14
In [5]: print(X.dot(THETA.T))
14
And:
x1*theta1 + x2*theta2 + x3*theta3 =
1*1 + 2*2 + 3*3 =
14
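As a quick check of the arithmetic above, here is a small sketch comparing the element-wise sum with dot on 1D arrays. The names manual, via_dot and transposed_shape are mine, purely for illustration. It also shows why In [5] gives the same answer as In [4]: transposing a 1D array leaves it unchanged.

```python
import numpy as np

X = np.array([1, 2, 3])
THETA = np.array([1, 2, 3])

# Element-wise products summed: x1*theta1 + x2*theta2 + x3*theta3
manual = int(np.sum(X * THETA))

# dot on two 1D arrays computes the same inner product
via_dot = int(X.dot(THETA))

# .T does nothing to a 1D array, so the shape is still (3,)
transposed_shape = THETA.T.shape

print(manual, via_dot, transposed_shape)
```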
Upvotes: 0
Reputation: 280973
dot for 2D input is matrix multiplication, not a dot product. What you're seeing is just the result of the normal rules of matrix multiplication. If you want a vector dot product, the easiest way is to use 1D vectors, with no superfluous second dimension:
X = np.array([1, 2, 3])
THETA = np.array([1, 2, 3])
print(X.dot(THETA))
dot-ting two 1D arrays takes a dot product and produces a scalar result.
If you want to use row and column vectors, then by the standard rules of matrix multiplication, you need to multiply a 1-by-N array (a row vector) by an N-by-1 array (a column vector) to get a 1-by-1 result, and NumPy will give you a 1-by-1 array rather than a scalar.
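To make the shape rule concrete, here is a minimal sketch using the same values as the question. The names row, col, inner and scalar are illustrative, and pulling the value out with .item() is just one option I'm assuming you might want, not something the question requires.

```python
import numpy as np

row = np.array([[1, 2, 3]])   # row vector, shape (1, 3)
col = row.T                   # column vector, shape (3, 1)

# (1, 3) times (3, 1): inner dimensions match, result has shape (1, 1)
inner = row.dot(col)

# Extract the single element as a plain Python number
scalar = inner.item()

# (1, 3) times (1, 3): inner dimensions 3 and 1 differ, so NumPy raises
try:
    row.dot(row)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(inner.shape, scalar, mismatch_raised)
```

If you only ever need the scalar, sticking with 1D arrays avoids the extra extraction step.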
Upvotes: 2