Reputation: 1212
I'm computing thousands of gradients and would like to vectorize the computations in Python. The context is SVM and the loss function is the hinge loss. y is Mx1, X is MxN, and w is Nx1.
L(w) = lam/2 * ||w||^2 + 1/m * sum_{i=1..m} max(0, 1 - y[i]*X[i]*w)
The gradient of this is
grad = lam*w + 1/m * sum_{i=1..m} { -y[i]*X[i].T if y[i]*X[i]*w < 1, else 0 }
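For reference, the loss itself vectorizes directly; a minimal sketch, assuming y and w are 1-D arrays of shapes (M,) and (N,) rather than column vectors (hinge_loss is just an illustrative name):

import numpy as np

def hinge_loss(X, y, w, lam):
    # margins[i] = y[i]*X[i].dot(w), computed for all i at once
    margins = y*X.dot(w)
    return lam/2 * np.dot(w, w) + np.mean(np.maximum(0, 1 - margins))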
Instead of looping through each element of the sum and evaluating the max function, is it possible to vectorize this? I want to use something like np.where, as in the following:
grad = np.where(y*X.dot(w) < 1, -X.T.dot(y), 0)
This does not work: the condition y*X.dot(w) < 1 has shape (M,) while -X.T.dot(y) has shape (N,), so np.where cannot broadcast them together. What each term of the sum actually needs is the whole row -y[i]*X[i], not a single element.
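To make the mismatch concrete, a tiny sketch with made-up shapes (M=5, N=3):

import numpy as np

# the condition is length M but the value array is length N,
# so np.where has nothing to broadcast
M, N = 5, 3
X = np.ones((M, N)); y = np.ones(M); w = np.ones(N)
print((y*X.dot(w) < 1).shape)   # (5,)
print((-X.T.dot(y)).shape)      # (3,)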
Edit: here is a list-comprehension version; I'd like to know if there's a cleaner or more efficient way.
import numpy as np

def grad(X, y, w, lam):
    # cache y[i]*X[i].dot(w); each entry of X.dot(w) is scaled by the matching element of y
    yXw = y*X.dot(w)
    # cache y[i]*X[i]; each row of X is scaled by the matching element of y
    yX = X*y[:,np.newaxis]
    # average the per-example terms row-wise (axis=0); use a zero row rather
    # than scalar 0 so every list element has shape (N,)
    zero = np.zeros(X.shape[1])
    return lam*w + np.mean([-yX[i] if yXw[i] < 1 else zero for i in range(len(y))], axis=0)
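A quick smoke test of the function above, with made-up shapes (M=5 examples, N=3 features) and labels in {-1, +1}:

import numpy as np

rng = np.random.default_rng(0)
M, N = 5, 3
X = rng.standard_normal((M, N))
y = rng.choice([-1.0, 1.0], size=M)
w = rng.standard_normal(N)

print(grad(X, y, w, lam=0.1))   # gradient of shape (N,)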
Upvotes: 1
Views: 5178
Reputation: 66805
You have two vectors A and B, and you want to return an array C such that C[i] = A[i] if B[i] < 1 and 0 otherwise. Consequently, all you need to do is
C := A * sign(max(0, 1-B)) # surprisingly similar to the original hinge loss, right? :)
since sign(max(0, 1-B[i])) is 1 when B[i] < 1 and 0 otherwise. So in your code it will be something like
B = (y*X.dot(w)).ravel()   # B[i] = y[i]*X[i].dot(w), shape (M,)
A = -X*y[:,np.newaxis]     # row i is -y[i]*X[i], shape (M, N)
C = A * np.sign(np.maximum(0, 1-B))[:,np.newaxis]
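Putting the trick back into the full gradient, a sketch under the same assumption that y and w are 1-D arrays (grad_vec is just an illustrative name):

import numpy as np

def grad_vec(X, y, w, lam):
    # mask[i] = 1 if y[i]*X[i].dot(w) < 1 else 0, via the sign(max(...)) trick
    mask = np.sign(np.maximum(0, 1 - y*X.dot(w)))
    # row i is -y[i]*X[i] where the margin is violated, a zero row otherwise
    C = -X*y[:,np.newaxis] * mask[:,np.newaxis]
    return lam*w + C.mean(axis=0)

This should agree with the list-comprehension version in the question, since both zero out the term when y[i]*X[i].dot(w) >= 1.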
Upvotes: 2