Reputation: 3
I'm trying to take the gradient of this loss function (N objects, m features):
def L(y, X, w):  # loss function
    return np.sum(np.log1p(np.exp(np.dot(w, -X.T * y))))
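In math form (objects are the rows of X, so -X.T * y has columns -y_i * x_i), this is the logistic loss L(w) = sum_i log(1 + exp(-y_i * (w·x_i))).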
Here is my calculation of the partial derivatives (the analytic gradient):
def g(y, X, w):  # analytic gradient
    return (-X.T * y).dot(1 - 1 / (1 + np.exp(np.dot(w, -X.T * y))))
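The derivative I'm aiming for, writing z_i = -y_i * (w·x_i), is
dL/dw = sum_i (-y_i * x_i) * exp(z_i) / (1 + exp(z_i)) = sum_i (-y_i * x_i) * (1 - 1/(1 + exp(z_i))),
which is what the code above tries to compute.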
When I implement a numerical estimate of the gradient, its value differs from the analytic one, so I've probably made a mistake in my calculations.
Gradient checking:
e = 1e-4
test = (np.array([0.6, -0.2]),          # y
        np.array([[3, 8.5], [1, -5]]),  # X
        np.array([-1, 0.4]))            # w
grd = np.ones((test[1].shape[1],))
loss1 = L(test[0], test[1], test[2][0] - e)
loss2 = L(test[0], test[1], test[2][0] + e)
grd[0] = (loss2 - loss1) / (2 * e)
loss1 = L(test[0], test[1], test[2][1] - e)
loss2 = L(test[0], test[1], test[2][1] + e)
grd[1] = (loss2 - loss1) / (2 * e)
print('\ngrd num: ', grd)
print('\ngrd analyt: ', g(test[0], test[1], test[2]))
grd num: [-7.25478847 -1.47346219]
grd analyt: [-0.72164669 -2.59980408]
Where did I make a mistake?
Upvotes: 0
Views: 252
Reputation: 973
You have a mistake in your analytic gradient calculation:
def g(y, X, w):  # gradient
    return (-X.T * y).dot(1 - 1 / (1 + np.exp(np.dot(w, -X.T * y))))
The correct version is:
def g(y, X, w):
    return (-X.T * y).dot(1 / (1 + np.exp(np.dot(w, X.T * y))))
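Either version can be sanity-checked with a central-difference estimate that perturbs one coordinate of w at a time; note that the check in the question passes test[2][0] - e (a scalar) as the whole weight vector w, so L is never evaluated at a properly perturbed w. A minimal sketch (num_grad is a helper name introduced here, not from the original post):

import numpy as np

def L(y, X, w):  # loss function from the question
    return np.sum(np.log1p(np.exp(np.dot(w, -X.T * y))))

def g(y, X, w):  # gradient as above
    return (-X.T * y).dot(1 / (1 + np.exp(np.dot(w, X.T * y))))

def num_grad(f, w, e=1e-4):
    # central differences: perturb one coordinate of w at a time
    grd = np.zeros_like(w)
    for j in range(w.size):
        dw = np.zeros_like(w)
        dw[j] = e
        grd[j] = (f(w + dw) - f(w - dw)) / (2 * e)
    return grd

y = np.array([0.6, -0.2])
X = np.array([[3.0, 8.5], [1.0, -5.0]])
w = np.array([-1.0, 0.4])

print('grd num:   ', num_grad(lambda v: L(y, X, v), w))
print('grd analyt:', g(y, X, w))

With the question's test data, the two printed gradients should agree closely.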
Upvotes: 1