Reputation: 311
I've written an implementation in Python using NumPy of vectorized, regularized gradient descent for logistic regression. I've used a numerical check method to verify that my implementation is correct. The numerical check passes for my linear regression GD implementation, but it fails for logistic regression, and I cannot figure out why. Any help would be appreciated. So here goes:
These are my methods for calculating the cost and the gradient (the update function calculates the gradient and updates the parameters):
@staticmethod
def _hypothesis(parameters, features):
    return Activation.sigmoid(features.dot(parameters))

@staticmethod
def _cost_function(parameters, features, targets):
    m = features.shape[0]
    return np.sum(-targets * (np.log(LogisticRegression._hypothesis(parameters, features)) - (1 - targets) * (
        np.log(1 - LogisticRegression._hypothesis(parameters, features))))) / m

@staticmethod
def _update_function(parameters, features, targets, extra_param):
    regularization_vector = extra_param.get("regularization_vector", 0)
    alpha = extra_param.get("alpha", 0.001)
    m = features.shape[0]
    return parameters - alpha / m * (
        features.T.dot(LogisticRegression._hypothesis(parameters, features) - targets)) + \
        (regularization_vector / m) * parameters
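For reference, the cost I am trying to implement is the standard cross-entropy, J(θ) = -(1/m) · Σ [ y·log(h) + (1 − y)·log(1 − h) ], whose gradient is (1/m) · Xᵀ(h − y).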
The cost function doesn't include regularization, but the test I run uses a regularization vector equal to zero, so it does not matter. Here is how I am testing:
def numerical_check(features, parameters, targets, cost_function, update_function, extra_param, delta):
    gradients = - update_function(parameters, features, targets, extra_param)
    parameters_minus = np.copy(parameters)
    parameters_plus = np.copy(parameters)
    parameters_minus[0, 0] = parameters_minus[0, 0] + delta
    parameters_plus[0, 0] = parameters_plus[0, 0] - delta
    approximate_gradient = - (cost_function(parameters_plus, features, targets) -
                              cost_function(parameters_minus, features, targets)) / (2 * delta) / parameters.shape[0]
    return abs(gradients[0, 0] - approximate_gradient) <= delta
Basically, I am manually calculating the gradient by shifting the first parameter a delta amount to the left and to the right, and then comparing it with the gradient I get from the update function. I am using initial parameters equal to 0, so the updated parameters received are equal to the gradient divided by the number of features. Also, alpha is equal to one. Unfortunately, I am getting different values from the two methods and I cannot figure out why. Any advice on how to troubleshoot this problem would be really appreciated.
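For completeness, this is the textbook central-difference check I am basing mine on, written as a standalone sketch (here cost stands for any callable of the parameter array, not my class method):

import numpy as np

def numerical_gradient(cost, parameters, delta=1e-6):
    # Central-difference approximation of d(cost)/d(parameters),
    # perturbing one entry of the parameter array at a time.
    grad = np.zeros_like(parameters, dtype=float)
    it = np.nditer(parameters, flags=["multi_index"])
    while not it.finished:
        idx = it.multi_index
        original = parameters[idx]
        parameters[idx] = original + delta
        cost_plus = cost(parameters)
        parameters[idx] = original - delta
        cost_minus = cost(parameters)
        parameters[idx] = original  # restore before moving on
        grad[idx] = (cost_plus - cost_minus) / (2 * delta)
        it.iternext()
    return grad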
Upvotes: 4
Views: 247
Reputation: 885
There is an error in your cost function, due to misplaced brackets. I've fixed it:
def _cost_function(parameters, features, targets):
    m = features.shape[0]
    return -np.sum(
        targets * np.log(LogisticRegression._hypothesis(parameters, features))
        + (1 - targets) * np.log(1 - LogisticRegression._hypothesis(parameters, features))
    ) / m
Try writing your code cleanly; it makes errors like these much easier to spot.
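If you want to convince yourself, here is a quick standalone comparison on made-up data (I define a sigmoid inline so it runs on its own; none of this is your class code):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Tiny synthetic problem, just to compare the two bracketings.
rng = np.random.default_rng(0)
features = rng.normal(size=(5, 3))
parameters = rng.normal(size=(3, 1))
targets = rng.integers(0, 2, size=(5, 1))

h = sigmoid(features.dot(parameters))
m = features.shape[0]

# Your original bracketing vs. the fixed one above.
buggy = np.sum(-targets * (np.log(h) - (1 - targets) * np.log(1 - h))) / m
fixed = -np.sum(targets * np.log(h) + (1 - targets) * np.log(1 - h)) / m
print(buggy, fixed)  # the two values disagree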
Upvotes: 3
Reputation: 475
I think I spotted a possible error in your code; tell me if this is true.
In your numerical_check function you are calling the update_function to initialize the gradients. However, your _update_function above does not actually return the gradients; it returns the updated value of the parameters.
That is, notice that the return statement of your _update_function is this:
return parameters - alpha / m * (
    features.T.dot(LogisticRegression._hypothesis(parameters, features) - targets)) + \
    (regularization_vector / m) * parameters
What I would advise, and what I do in my own ML algorithms, is to make a separate function for calculating the gradients, e.g.:
def _gradient(features, parameters, targets):
    m = features.shape[0]
    return features.T.dot(LogisticRegression._hypothesis(parameters, features) - targets) / m
And then change your numerical_check function to initialize the gradient as follows:

    gradient = _gradient(features, parameters, targets)
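For illustration, here is a hypothetical end-to-end version of the check with that change applied (a standalone sketch: the sigmoid, cost and data below are my own stand-ins, not your actual class):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(parameters, features, targets):
    h = sigmoid(features.dot(parameters))
    m = features.shape[0]
    return -np.sum(targets * np.log(h) + (1 - targets) * np.log(1 - h)) / m

def gradient(features, parameters, targets):
    m = features.shape[0]
    return features.T.dot(sigmoid(features.dot(parameters)) - targets) / m

# Made-up data; parameters start at zero like in your test.
rng = np.random.default_rng(1)
features = rng.normal(size=(6, 2))
parameters = np.zeros((2, 1))
targets = rng.integers(0, 2, size=(6, 1))

delta = 1e-6
analytic = gradient(features, parameters, targets)[0, 0]
plus, minus = np.copy(parameters), np.copy(parameters)
plus[0, 0] += delta
minus[0, 0] -= delta
approx = (cost(plus, features, targets) - cost(minus, features, targets)) / (2 * delta)
print(abs(analytic - approx))  # should be tiny, e.g. ~1e-10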
I hope this solves your problem.
Upvotes: 2