VansFannel
VansFannel

Reputation: 45921

Compute L2 distance with numpy using matrix multiplication

I'm trying to do it by myself the assignments from Stanford CS231n 2017 CNN course.

I'm trying to compute L2 distance using only matrix multiplication and sum broadcasting with Numpy. L2 distance is:

enter image description here

And I think I can do it if I use this formula:

enter image description here

The following code shows three methods to compute L2 distance. If I compare the output from the method compute_distances_two_loops with the output from method compute_distances_one_loop, both are equals. But I compare the output from the method compute_distances_two_loops with the output from the method compute_distances_no_loops, where I have implemented the L2 distance using only matrix multiplication and sum broadcasting, they are different.

def compute_distances_two_loops(self, X):
    """
Compute the distance between each test point in X and each training point
in self.X_train using a nested loop over both the training data and the 
test data.

Inputs:
- X: A numpy array of shape (num_test, D) containing test data.

Returns:
- dists: A numpy array of shape (num_test, num_train) where dists[i, j]
  is the Euclidean distance between the ith test point and the jth training
  point.
"""
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in xrange(num_test):
        for j in xrange(num_train):
            #####################################################################
            # TODO:                                                             #
            # Compute the l2 distance between the ith test point and the jth    #
            # training point, and store the result in dists[i, j]. You should   #
            # not use a loop over dimension.                                    #
            #####################################################################
            #dists[i, j] = np.sqrt(np.sum((X[i, :] - self.X_train[j, :]) ** 2))
            dists[i, j] = np.sqrt(np.sum(np.square(X[i, :] - self.X_train[j, :])))
            #####################################################################
            #                       END OF YOUR CODE                            #
            #####################################################################
    return dists

def compute_distances_one_loop(self, X):
    """
Compute the distance between each test point in X and each training point
in self.X_train using a single loop over the test data.

Input / Output: Same as compute_distances_two_loops
"""
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    for i in xrange(num_test):
        #######################################################################
        # TODO:                                                               #
        # Compute the l2 distance between the ith test point and all training #
        # points, and store the result in dists[i, :].                        #
        #######################################################################
        dists[i, :] = np.sqrt(np.sum(np.square(self.X_train - X[i, :]), axis = 1))
        #######################################################################
        #                         END OF YOUR CODE                            #
        #######################################################################
    print(dists.shape)
    return dists

def compute_distances_no_loops(self, X):
    """
Compute the distance between each test point in X and each training point
in self.X_train using no explicit loops.

Input / Output: Same as compute_distances_two_loops
"""
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    #########################################################################
    # TODO:                                                                 #
    # Compute the l2 distance between all test points and all training      #
    # points without using any explicit loops, and store the result in      #
    # dists.                                                                #
    #                                                                       #
    # You should implement this function using only basic array operations; #
    # in particular you should not use functions from scipy.                #
    #                                                                       #
    # HINT: Try to formulate the l2 distance using matrix multiplication    #
    #       and two broadcast sums.                                         #
    #########################################################################
    dists = np.sqrt(-2 * np.dot(X, self.X_train.T) +
                    np.sum(np.square(self.X_train), axis=1) +
                    np.sum(np.square(X), axis=1)[:, np.newaxis])
    print(dists.shape)
    #########################################################################
    #                         END OF YOUR CODE                              #
    #########################################################################
    return dists

You can find a full working testable code here.

Do you know what am I doing wrong in compute_distances_no_loops, or wherever?

UPDATE:

The code that throws the error message is:

dists_two = classifier.compute_distances_no_loops(X_test)

# check that the distance matrix agrees with the one we computed before:
difference = np.linalg.norm(dists - dists_two, ord='fro')
print('Difference was: %f' % (difference, ))
if difference < 0.001:
    print('Good! The distance matrices are the same')
else:
    print('Uh-oh! The distance matrices are different')

And the error message:

Difference was: 372100.327569
Uh-oh! The distance matrices are different

Upvotes: 10

Views: 12924

Answers (5)

Guillem
Guillem

Reputation: 2647

I think that you are looking for the pairwise distance.

There is an amazing trick to do that in a single line. You have to cleverly play with the broadcasting:

X_train = np.expand_dims(self.X_train, 1) # shape: [num_train, 1, D]
X_test = np.expand_dims(X, 0) # shape: [1, num_tests, D]
dists = np.square(X_train - X_test) # Thanks to broadcast [num_train, num_tests, D]
dists = np.sqrt(np.sum(dists, axis=-1)) # [num_train, num_tests]

Upvotes: 0

Seungjin Yang
Seungjin Yang

Reputation: 1

I believe the issue comes from inconsistent array shapes.

#a^2 matrix (500, 1) 
alpha = np.sum(np.square(X), axis=1)
alpha = alpha.reshape(len(alpha), 1)
print(alpha.shape)
#b^2 matrix (1, 5000) 
beta  = np.sum(np.square(self.X_train.T), axis=0)
beta = beta.reshape(1, len(beta))
print(beta.shape)
#ab matrix (500, 5000)
alphabeta = np.dot(X, self.X_train.T)
print(alphabeta.shape)

dists = np.sqrt(-2 * alphabeta + alpha + beta)

Upvotes: 0

tqbinh
tqbinh

Reputation: 1

This is my solution for function compute_distances_no_loops() that the OP asked for. I don't use the sqrt() function for performance reason:

def compute_distances_no_loops(self, X):
    num_test = X.shape[0]
    num_train = self.X_train.shape[0]
    dists = np.zeros((num_test, num_train))
    #---------------
    # Get square of X and X_train
    X_sq = np.sum(X**2, axis=1, keepdims=True) 
    Xtrain_sq = np.sum(self.X_train**2, axis=1, keepdims=True)
    # Calculate (squared) dists as (X_train - X)**2 = X_train**2 - 2*X_train*X + X**2
    dists = -2*X.dot(self.X_train.T) + X_sq + Xtrain_sq.T
    #---------------  
    return dists

Upvotes: 0

Hakan Akg&#252;n
Hakan Akg&#252;n

Reputation: 927

It's a late response, but I've solved it in a different way and wanted to post it. When I was solving this, I was unaware of the numpy's column-row vector subtractions from the matrix. As it turns out, we can subtract nx1 or 1xm vector from nxm and when we do it, subtracts from every row-column vector. If one works with a library that doesn't support this kind of behavior, he/she can use mine. For this situation, I've worked it out the math, and the result is the following one:

sum_x_train=np.sum(self.X_train**2,axis=1, keepdims=True)
sum_x_test=np.sum(X**2,axis=1, keepdims=True)
sum_2x_tr_te=np.dot(self.X_train,X.T)*2
sum_x_train=np.dot(sum_x_train,np.ones((1,X.shape[0])))
sum_x_test=np.dot(sum_x_test,np.ones((1,self.X_train.shape[0])))
dists=np.sqrt(sum_x_test.T+sum_x_train-sum_2x_tr_te).T

The downside of this approach is it uses more memory.

Upvotes: 0

Viktoriya Malyasova
Viktoriya Malyasova

Reputation: 1425

Here is how you can compute pairwise distances between rows of X and Y without creating any 3-dimensional matrices:

def dist(X, Y):
    sx = np.sum(X**2, axis=1, keepdims=True)
    sy = np.sum(Y**2, axis=1, keepdims=True)
    return np.sqrt(-2 * X.dot(Y.T) + sx + sy.T)

Upvotes: 13

Related Questions