Reputation: 7
I am following a tutorial in an IPython notebook. My intention is to calculate (X - X_train)^2 and store the result in dists. The following code seems to work, but I don't understand how it works.
Why does (2*inner_prod + train_sum), which adds differently-sized matrices, yield a 500x5000 matrix?
How are the matrices processed in the calculation of dist?
test_sum = np.sum(np.square(X), axis=1) # sum of squares per test example, shape (500,)
train_sum = np.sum(np.square(self.X_train), axis=1) # sum of squares per training example, shape (5000,)
inner_prod = np.dot(X, self.X_train.T) #matrix multiplication for 2-D arrays (500x3072)*(3072x5000)=(500x5000)
print inner_prod.shape
print X.shape
print self.X_train.T.shape
print test_sum.shape
print test_sum.size
print train_sum.shape
print train_sum.size
print test_sum.reshape(-1,1).shape
# how... does reshaping work???
print (2*inner_prod+train_sum).shape
dists = np.sqrt(np.reshape(test_sum,(-1,1)) - 2 * inner_prod + train_sum) # (500,1) - 2*(500,5000) + (5000,) = (500,5000)
print dists.shape
The print statements give the following:
(500L, 5000L)
(500L, 3072L)
(3072L, 5000L)
(500L,)
500
(5000L,)
5000
(500L, 1L)
(500L, 5000L)
(500L, 5000L)
Upvotes: 0
Views: 116
Reputation: 231475
print train_sum.shape # (5000,)
print train_sum.size
print test_sum.reshape(-1,1).shape # (500,1)
# how... does reshaping work???
print (2*inner_prod+train_sum).shape
test_sum.reshape(-1,1) returns a new array with a new shape (but shared data). It does not reshape test_sum itself.
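A minimal sketch of that behaviour, using a small stand-in array instead of the real test_sum (the values and the name column are made up for illustration):

```python
import numpy as np

test_sum = np.arange(6.0)            # stand-in 1-D array, shape (6,)
column = test_sum.reshape(-1, 1)     # a new array object with shape (6, 1)

print(test_sum.shape)                # (6,)  the original is still 1-D
print(column.shape)                  # (6, 1)
print(np.shares_memory(test_sum, column))   # True: same underlying data

column[0, 0] = 99.0                  # writing through the reshaped array...
print(test_sum[0])                   # 99.0  ...is visible in the original
```

So the reshaped result has to be used directly in the expression or assigned to a name; calling reshape and discarding the result changes nothing.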
So broadcasting handles the addition like this:
(500,5000) + (5000,) => (500,5000)+(1,5000)=>(500,5000)
If train_sum had actually been reshaped to a column, you'd have gotten an error.
(500,5000) + (5000,1) => error
In [68]: np.ones((500,5000))+np.zeros((5000,1))
ValueError: operands could not be broadcast together with shapes (500,5000) (5000,1)
There's really only one way to add that (500,5000) array and the (5000,) one, and that's what you got, since train_sum was never reshaped.
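The same rule can be checked on small arrays. This is only a sketch: the sizes 5 and 7 are arbitrary stand-ins for 500 and 5000, chosen to be different so a mismatch cannot pass by accident.

```python
import numpy as np

inner_prod = np.ones((5, 7))      # stands in for the (500, 5000) array
train_sum = np.zeros(7)           # stands in for the (5000,) array

# A 1-D array is aligned on the trailing axis: (7,) -> (1, 7) -> (5, 7).
result = 2 * inner_prod + train_sum
print(result.shape)               # (5, 7)

# Writing the row shape out explicitly gives the same result.
same = 2 * inner_prod + train_sum.reshape(1, -1)
print(np.array_equal(result, same))   # True

# A (7, 1) column cannot be aligned with (5, 7): the first axes are 7 and 5.
try:
    inner_prod + train_sum.reshape(-1, 1)
    failed = False
except ValueError as err:
    failed = True
    print(err)
```

Shapes are compared from the last axis backwards, and each pair of sizes must be equal or contain a 1. That is why the (5000,) array can only line up with the columns of the (500,5000) array.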
train_sum.shape = (-1,1) acts in place, but isn't used as often as reshape. Stick with reshape, but use it right.
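As a rough illustration of using reshape correctly, here is the distance computation from the question on small random data, checked against a brute-force calculation. The array sizes, the seed, the names sq and expected, and the np.maximum clipping are assumptions added for this sketch and are not part of the original code.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(4, 3)          # 4 test examples, 3 features (stand-in data)
X_train = rng.rand(6, 3)    # 6 training examples, 3 features

test_sum = np.sum(np.square(X), axis=1)         # shape (4,)
train_sum = np.sum(np.square(X_train), axis=1)  # shape (6,)
inner_prod = np.dot(X, X_train.T)               # shape (4, 6)

# (4, 1) - (4, 6) + (6,) broadcasts to (4, 6):
# the column varies down the rows, the 1-D array varies across the columns.
sq = test_sum.reshape(-1, 1) - 2 * inner_prod + train_sum
dists = np.sqrt(np.maximum(sq, 0))   # clip tiny negative round-off before sqrt

# Brute-force check: explicit pairwise differences, shape (4, 6, 3).
diff = X[:, None, :] - X_train[None, :, :]
expected = np.sqrt(np.sum(diff ** 2, axis=2))

print(dists.shape)                     # (4, 6)
print(np.allclose(dists, expected))    # True
```

Only test_sum needs the reshape to a column; train_sum is left 1-D so that it broadcasts across the columns.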
Upvotes: 1