Ifan 767

Reputation: 512

How to write cost function formula from Andrew Ng assignment in Octave?

My implementation (see below) gives the scalar value 3.18, which is not the right answer. The value should be 0.693. Where does my code deviate from the equation?

Here is the setup code that loads the data and calls the cost function in Octave:

data = load('ex2data1.txt');
X = data(:, [1, 2]); y = data(:, 3);
[m, n] = size(X);
X = [ones(m, 1) X];
initial_theta = zeros(n + 1, 1);
[cost, grad] = costFunction(initial_theta, X, y);

The data file ex2data1.txt is included in this package: data link.

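The formula for the cost function is: J(theta) = (1/m) * sum over i = 1..m of [ -y(i) * log(h(x(i))) - (1 - y(i)) * log(1 - h(x(i))) ], where h(x) = sigmoid(theta' * x).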

Here is the code I am using:

function [J, grad] = costFunction(theta, X, y)

m = length(y); % number of training examples

% You need to return the following variables correctly 
J = 0; %#ok<NASGU>
grad = zeros(size(theta)); %#ok<NASGU>

hx = sigmoid(X * theta)';
m = length(X);

J = sum(-y' * log(hx) - (1 - y')*log(1 - hx)) / m;

grad = X' * (hx - y) / m;

end

Here is the sigmoid function:

function g = sigmoid(z)
g = 1/(1+exp(-z));
end

Upvotes: 3

Views: 2574

Answers (2)

Prikshit Setia

Reputation: 31

I think the mistake is in your sigmoid function. Here is a corrected version, followed by the cost function:

function g = sigmoid(z)
   g = zeros(size(z));
   temp=1+exp(-1.*z);
   g=1./temp;
end


function [J, grad] = costFunction(theta, X, y)
   m = length(y); 
   J = 0;
   grad = zeros(size(theta));
   h=X*theta;
   xtemp=sigmoid(h);
   temp1=(-y'*log(xtemp));
   temp2=(1-y)'*log(1-xtemp);
   J=1/m*sum(temp1-temp2);
   grad=1/m*(X'*(xtemp-y));
end

I also think the second term should be written as (1-y)', as in temp2 = (1-y)' * log(1-xtemp).

Upvotes: 2

rayryeng

Reputation: 104474

Your sigmoid function is incorrect. The incoming data type is a vector but the operations you are using are performing matrix division. This needs to be element-wise.

function g = sigmoid(z)
    g = 1.0 ./ (1.0 + exp(-z));
end

By writing 1 / A, where A is an expression, you are in fact computing the inverse of A. Since inverses only exist for square matrices, this computes the pseudo-inverse instead, which is definitely not what you want.
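To make the difference concrete, here is a minimal sketch comparing the two operators on a small column vector. The input z is made up for illustration and is not taken from the assignment data. The point is only the shape of the result: matrix division turns the column into a row, while element-wise division keeps the shape of the input.

```octave
z = [0; 1; 2];          % hypothetical 3x1 input, not from ex2data1.txt
A = 1 + exp(-z);        % 3x1 column vector

g_matrix = 1 / A;       % matrix right division: solves x * A = 1, giving a 1x3 row
g_elem   = 1 ./ A;      % element-wise division: 3x1, one sigmoid value per input

disp(size(g_matrix));   % shape of the matrix-division result
disp(size(g_elem));     % shape of the element-wise result
```

This would also explain why the transpose in the original costFunction appeared to be needed: it was turning the row produced by the matrix division back into a column.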

You can keep most of your costFunction code the same as you're using the dot product. I would get rid of the sum since that is implied with the dot product. I'll mark my changes with comments:

function [J, grad] = costFunction(theta, X, y)

m = length(y); % number of training examples

% You need to return the following variables correctly 
%J = 0; %#ok<NASGU> <-- Don't need to declare this as you'll create the variables later
%grad = zeros(size(theta)); %#ok<NASGU>

hx = sigmoid(X * theta);  % <-- Remove transpose
m = length(X);

J = (-y' * log(hx) - (1 - y')*log(1 - hx)) / m; % <-- Remove sum

grad = X' * (hx - y) / m;

end
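As a quick sanity check on the expected value of 0.693: when theta is all zeros, the sigmoid returns 0.5 for every example, so the cost reduces to -log(0.5) = log(2), which is about 0.693 regardless of the data. The sketch below shows this with a tiny made-up data set (the values in X and y are hypothetical, not read from ex2data1.txt) and an anonymous sigmoid so that it runs on its own.

```octave
% Element-wise sigmoid, written as an anonymous function for a self-contained script
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Hypothetical data: 4 examples, intercept column plus 2 features
X = [1 10 20; 1 15 5; 1 30 40; 1 45 35];
y = [0; 0; 1; 1];

theta = zeros(3, 1);    % same starting point as initial_theta
m = length(y);

hx = sigmoid(X * theta);                            % 4x1 vector, every entry is 0.5
J = (-y' * log(hx) - (1 - y)' * log(1 - hx)) / m;   % scalar cost, no sum needed
grad = X' * (hx - y) / m;                           % 3x1 gradient

printf('J = %.4f\n', J);
```

If the cost at zero theta is anything other than log(2), the problem is most likely in the sigmoid or in the shapes of the operands rather than in the data.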

Upvotes: 1
