How does adding a (500x5000) and (5000x1) matrix result in a (500x5000) matrix?

Question

I am following a tutorial in an IPython notebook. My intention is to compute (X - X_train)^2 and store the result in dists. The following code seems to work, but I don't understand how it works.

Why does (2*inner_prod + train_sum), which adds differently sized matrices, yield a 500x5000 matrix?

How are the matrices processed in the calculation of dists?

    test_sum = np.sum(np.square(X), axis=1) # summed each example #(500x1)
    train_sum = np.sum(np.square(self.X_train), axis=1) # summed each example #(5000x1)
    inner_prod = np.dot(X, self.X_train.T) #matrix multiplication for 2-D arrays (500x3072)*(3072x5000)=(500x5000)
    print inner_prod.shape
    print X.shape
    print self.X_train.T.shape
    print test_sum.shape
    print test_sum.size
    print train_sum.shape
    print train_sum.size
    print test_sum.reshape(-1,1).shape
    # how... does reshaping work???
    print (2*inner_prod+train_sum).shape
    dists = np.sqrt(np.reshape(test_sum,(-1,1)) - 2 * inner_prod + train_sum) # (500x1) - 2*(500x5000) + (5000x1) = (500x5000)
    print dists.shape

The print statements give the following:

(500L, 5000L)
(500L, 3072L)
(3072L, 5000L)
(500L,)
500
(5000L,)
5000
(500L, 1L)
(500L, 5000L)
(500L, 5000L)

hpaulj · Accepted Answer

    print train_sum.shape                 # (5000,)
    print train_sum.size
    print test_sum.reshape(-1,1).shape    # (500,1)
    # how... does reshaping work???
    print (2*inner_prod+train_sum).shape

test_sum.reshape(-1,1) returns a new array with a new shape (but shared data). It does not reshape test_sum itself, and nothing in your code reshapes train_sum.
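To make that concrete, here is a minimal sketch using a small made-up array in place of the real test_sum (the values and the name col are only for illustration):

```python
import numpy as np

# Stand-in for test_sum: a 1-D array of shape (4,) instead of (500,)
test_sum = np.arange(4.0)

# reshape returns a new array object; test_sum keeps its own shape
col = test_sum.reshape(-1, 1)
print(test_sum.shape)   # (4,)
print(col.shape)        # (4, 1)

# The new array shares the underlying data with the original
print(np.shares_memory(test_sum, col))  # True
col[0, 0] = 99.0
print(test_sum[0])      # 99.0
```

So unless the returned array is kept or used directly, the reshape has no lasting effect.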

So the addition broadcasts like this: shapes are aligned from the right, and the 1-D array is treated as a single row.

    (500,5000) + (5000,) => (500,5000) + (1,5000) => (500,5000)
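The same alignment can be checked on small arrays. In this sketch the shapes (2,3) and (3,) are scaled-down stand-ins for (500,5000) and (5000,), and the values are made up:

```python
import numpy as np

inner_prod = np.zeros((2, 3))             # stand-in for the (500, 5000) array
train_sum = np.array([10.0, 20.0, 30.0])  # stand-in for the (5000,) array

# Broadcasting aligns shapes from the right, so (3,) is treated as (1, 3)
# and that single row is reused for every row of inner_prod.
result = 2 * inner_prod + train_sum
print(result.shape)  # (2, 3)
print(result)
# [[10. 20. 30.]
#  [10. 20. 30.]]

# Adding an explicit leading axis gives the same result
explicit = 2 * inner_prod + train_sum[np.newaxis, :]
print(np.array_equal(result, explicit))  # True
```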

If train_sum had actually been reshaped to (5000,1), you'd have gotten an error.

    (500,5000) + (5000,1) => error

    In [68]: np.ones((500,5000))+np.zeros((5000,1))
    ValueError: operands could not be broadcast together with shapes (500,5000) (5000,1)

There's really only one way to add a (500,5000) array and a (5000,) one, and that's what you got, because no reshape ever took effect on train_sum.

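Assigning train_sum.shape = (-1,1) acts in place, but it isn't used as often as reshape. Stick with reshape, but use it right: keep the array it returns, or use it directly in the expression, as your dists line does with test_sum.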
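As a rough sketch of the whole calculation, here is the dists expression from the question on small random arrays, checked against a direct pairwise computation. The array sizes, the random data, and the names rng, sq and direct are my own stand-ins, not part of the tutorial:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((5, 8))         # stand-in for the (500, 3072) test data
X_train = rng.random((7, 8))   # stand-in for the (5000, 3072) training data

test_sum = np.sum(np.square(X), axis=1)         # shape (5,)
train_sum = np.sum(np.square(X_train), axis=1)  # shape (7,)
inner_prod = np.dot(X, X_train.T)               # shape (5, 7)

# (5, 1) - (5, 7) + (7,) broadcasts to (5, 7):
# the column varies down the rows, the 1-D array varies across the columns.
sq = test_sum.reshape(-1, 1) - 2 * inner_prod + train_sum
dists = np.sqrt(sq)

# Direct computation of every pairwise distance, for comparison
diff = X[:, np.newaxis, :] - X_train[np.newaxis, :, :]  # shape (5, 7, 8)
direct = np.sqrt(np.sum(diff ** 2, axis=2))

print(dists.shape)                 # (5, 7)
print(np.allclose(dists, direct))  # True
```

Only test_sum needs to become a column here. train_sum has to stay 1-D (or be a (1, 7) row) so that it lines up with the columns of inner_prod.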
