Reputation:
function [J, grad] = costFunction(theta, X, y)
    data = load('ex2data1.txt');
    y = data(:, 3);
    theta = [1; 1; 2];
    m = length(y);
    one = ones(m, 1);
    X1 = data(:, [1, 2]);
    X = [one X1];
    J = 0;
    grad = zeros(size(theta));
    J = 1/m * ((sum(-y * log(sigmoid(X * theta)))) - (sum(1 - y * log(1 - sigmoid(X * theta)))));
    for i = 1:m
        grad = (1/m) * sum(sigmoid(X * theta) - y') * X;
    end
end
I want to know if I implemented the cost function and gradient correctly; I am getting NaN as the answer with this. Also, does theta(1) always have to be 0? I have it as 1 here. And how many iterations do I need for the gradient: should that be equal to the length of the matrix, or something else?
Upvotes: 3
Views: 2262
Reputation: 113
function [J, grad] = costFunction(theta, X, y)
    m = length(y);               % number of training examples
    J = 0;                       % initialize return values
    grad = zeros(size(theta));

    sig = 1./(1 + (exp(-(X * theta))));   % hypothesis for all examples at once
    J = ((-y' * log(sig)) - ((1 - y)' * log(1 - sig)))/m;
    grad = ((sig - y)' * X)/m;   % row vector; transpose if a column is required
end
where
sig = 1./(1 + (exp(-(X * theta))));
is the matrix representation of the logistic regression hypothesis, which is defined as:

h_θ(x) = g(θᵀx)
where the function g is the sigmoid function, defined as:

g(z) = 1 / (1 + e^(−z))
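In code, this can live in its own file so both versions above can call it. A minimal sketch of the sigmoid.m helper the question's code assumes; the element-wise ./ and exp make it work on scalars, vectors, and matrices alike:

function g = sigmoid(z)
    % Element-wise logistic function; accepts scalar, vector, or matrix z
    g = 1 ./ (1 + exp(-z));
end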
J = ((-y' * log(sig)) - ((1 - y)' * log(1 - sig)))/m;
is the matrix representation of the cost function for logistic regression:

J(θ) = (1/m) · Σ_{i=1..m} [ −y^(i) log(h_θ(x^(i))) − (1 − y^(i)) log(1 − h_θ(x^(i))) ]
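To see how the vectorized line matches this summation, here is an equivalent unvectorized version, a sketch for comparison only (it assumes the sigmoid helper shown above):

% Loop form of the cost, equivalent to the one-line matrix version
J_loop = 0;
for i = 1:m
    h = sigmoid(X(i, :) * theta);        % scalar hypothesis for example i
    J_loop = J_loop - y(i) * log(h) - (1 - y(i)) * log(1 - h);
end
J_loop = J_loop / m;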
and
grad = ((sig - y)' * X)/m;
is the matrix representation of the gradient of the cost, a vector of the same length as θ whose jth element (for j = 0, 1, ..., n) is defined as follows:

∂J(θ)/∂θ_j = (1/m) · Σ_{i=1..m} (h_θ(x^(i)) − y^(i)) · x_j^(i)
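Again, a loop-form sketch that mirrors this formula element by element (note it builds a column vector, the transpose of the vectorized row-vector result):

% Loop form of the gradient, equivalent to ((sig - y)' * X)/m up to transposition
grad_loop = zeros(size(theta));
for j = 1:length(theta)
    for i = 1:m
        h = sigmoid(X(i, :) * theta);
        grad_loop(j) = grad_loop(j) + (h - y(i)) * X(i, j);
    end
end
grad_loop = grad_loop / m;

Finally, a minimal sketch of how this costFunction might be called, assuming the ex2data1.txt file from the question. Data loading and theta initialization happen outside the function, and theta(1) is simply the intercept weight: it is conventionally initialized to 0, but it is updated like any other parameter, not held fixed.

data = load('ex2data1.txt');                      % data file from the question
X = [ones(size(data, 1), 1), data(:, [1, 2])];    % prepend the intercept column
y = data(:, 3);
initial_theta = zeros(size(X, 2), 1);             % conventional all-zeros start
[J, grad] = costFunction(initial_theta, X, y);
fprintf('Cost at initial theta: %f\n', J);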
Upvotes: 2