Reputation: 35
I was solving an exercise from an online Coursera course on machine learning. The problem statement is:
Suppose that a high school has a dataset representing 40 students who were admitted to college and 40 students who were not admitted. Each (x(i), y(i)) training example contains a student's scores on two standardized exams and a label of whether the student was admitted.
Our task is to build a binary classification model that estimates college admission chances based on a student's scores on two exams. In the training data,
a. The first column of your x array represents all Test 1 scores, and the second column represents all Test 2 scores.
b. The y vector uses '1' to label a student who was admitted and '0' to label a student who was not admitted.
I have already solved it using the predefined function fminunc. Now I am solving it with gradient descent, but my graph of cost versus number of iterations is not converging, i.e. the cost function value is not decreasing with the number of iterations. My theta values also do not match the answer that I should get.
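For reference, the logistic-regression hypothesis, cost, and batch gradient-descent update are usually written as below. This is standard textbook notation rather than anything taken from the course files: $m$ is the number of training examples, $\alpha$ is the learning rate, and the $1/m$ factor in the cost is a convention that only rescales it.

```latex
\begin{align}
h_\theta(x) &= \frac{1}{1 + e^{-\theta^{T} x}} \\
J(\theta) &= -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta\big(x^{(i)}\big) + \big(1 - y^{(i)}\big) \log\big(1 - h_\theta\big(x^{(i)}\big)\big) \right] \\
\theta_j &:= \theta_j - \frac{\alpha}{m} \sum_{i=1}^{m} \left( h_\theta\big(x^{(i)}\big) - y^{(i)} \right) x_j^{(i)}
\end{align}
```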
Theta values that I got:
[-0.085260 0.047703 -0.022851]
Theta values that I should get (the answer):
[-16.38 0.1483 0.1589]
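As a quick sanity check on what the expected theta means, the hypothesis can be evaluated by hand for one student. This is only a sketch: the scores 20 and 80 are hypothetical values chosen for illustration, and `theta_expected` and `student` are names introduced here, not part of the exercise.

```octave
% Expected parameters quoted above: intercept, Test 1 weight, Test 2 weight.
theta_expected = [-16.38; 0.1483; 0.1589];

% Hypothetical student (not from the dataset): intercept term 1, then the two scores.
student = [1, 20, 80];

% Sigmoid of the linear score gives the estimated admission probability.
p_admit = 1 / (1 + exp(-(student * theta_expected)));
fprintf('Estimated admission probability: %.4f\n', p_admit);   % about 0.3314
```

The linear score here is -16.38 + 0.1483*20 + 0.1589*80 = -0.702, which the sigmoid maps to roughly 0.33.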
My source code:
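clear; close all; clc
% Load the two exam scores (x) and the admission labels (y).
x = load('/home/utlesh/Downloads/ex4x.txt');
y = load('/home/utlesh/Downloads/ex4y.txt');
theta = [0, 0, 0];       % parameters as a row vector
alpha = 0.00002;         % learning rate
iters = 100000;          % number of gradient descent iterations
m = size(x, 1);          % number of training examples
x = [ones(m, 1) x];      % prepend the intercept column
n = size(x, 2);
a = zeros(iters, n);     % theta after each iteration
y_hyp = y * ones(1, n);  % labels replicated across the n columns
% Batch gradient descent.
for kk = 1:iters
    hyposis = 1 ./ (1 + exp(-(x * theta')));
    x_hyp = hyposis * ones(1, n);
    theta = theta - alpha * 1/m * sum((x_hyp - y_hyp) .* x);
    a(kk, :) = theta;
end
% Cost (summed, not averaged, over the examples) after each iteration.
cost = zeros(iters, 1);
for kk = 1:iters
    h = 1 ./ (1 + exp(-(x * a(kk, :)')));
    cost(kk) = sum(-y .* log(h) - (1 - y) .* log(1 - h));
end
x_axis = (1:iters)';
plot(x_axis, cost);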
The graph that I got looks like that of 1/x.
Please tell me where I am making a mistake. If there is anything that I have misunderstood, please let me know.
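One common reason for very slow progress with plain gradient descent is raw exam scores (roughly 0 to 100) combined with a very small learning rate. The sketch below assumes that is what is happening here; it has not been run on the course files. It uses a small hypothetical dataset, standardizes the two score columns, runs the same batch update with a much larger `alpha`, and then maps theta back to the original score units. The data, `alpha = 0.5`, and the names `scores`, `mu`, `sigma` and `theta_orig` are all assumptions made for illustration.

```octave
% Hypothetical toy data (NOT the course files): two exam scores and a 0/1 label.
scores = [20 30; 35 40; 45 35; 50 60; 70 65; 60 45; 65 70; 75 55; 80 85];
y      = [0; 0; 0; 0; 0; 1; 1; 1; 1];
m      = size(scores, 1);

% Standardize each score column to zero mean and unit standard deviation.
mu    = mean(scores);
sigma = std(scores);
x     = [ones(m, 1), (scores - ones(m, 1) * mu) ./ (ones(m, 1) * sigma)];

theta = zeros(3, 1);   % column vector: intercept, Test 1, Test 2
alpha = 0.5;           % assumed step size; workable only because x is scaled
iters = 2000;
cost  = zeros(iters, 1);

for kk = 1:iters
    h = 1 ./ (1 + exp(-(x * theta)));
    cost(kk) = -sum(y .* log(h) + (1 - y) .* log(1 - h)) / m;   % cost before this update
    theta = theta - (alpha / m) * (x' * (h - y));               % batch gradient step
end

% Map the parameters back to the original (unscaled) score units.
theta_orig = [theta(1) - sum(theta(2:3)' .* mu ./ sigma); theta(2:3) ./ sigma'];
```

Plotting `cost` against `1:iters` shows whether the curve flattens out. `theta_orig` is the vector to compare with parameters fitted on raw scores, since `theta` itself refers to the standardized columns.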
Upvotes: 0
Views: 468
Reputation: 159
What I see missing is the use of a learning rate and weights. The weights can be adjusted in two modes: online and batch.
The weights should be initialized to random values in [-0.01, 0.01]. I did this exercise as part of my homework during my Master's. Below is the snippet.
Assign the weights values in [-0.01, 0.01]; the number of weights is the number of features + 1:
weights = -.01 + 0.02 * rand(3,1);
learnRate = 0.001;
Here the code runs for a set number of iterations (it also converged within 100 iterations):
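% Assumptions added for completeness: numericdata is a rows-by-4 matrix whose
% columns 1-3 are the inputs (including a bias column) and column 4 is the 0/1 label.
[rows, cols] = size(numericdata);
new_output = zeros(rows, 1);
errorStr = zeros(100, 1);   % assumed: one error value stored per iteration
iter = 0;
i = 1;
while iter < 100
    old_output = new_output;
    delta = zeros(cols-1, 1);
    for t = 1:rows
        % Weighted sum of the inputs for example t.
        input = 0;
        for j = 1:cols-1
            input = input + weights(j) * numericdata(t,j);
        end
        new_output(t) = 1 ./ (1 + exp(-input));
        % Accumulate the gradient contribution of example t.
        for j = 1:cols-1
            delta(j) = delta(j) + (numericdata(t,4) - new_output(t)) * numericdata(t,j);
        end
    end
    % Adjusting weights (batch mode):
    for j = 1:cols-1
        weights(j) = weights(j) + learnRate * delta(j);
    end
    err = abs(numericdata(:,4) - new_output);
    errorStr(i) = sum(err);   % assumed: store the summed absolute error, since a vector does not fit in one slot
    iter = iter + 1;
    i = i + 1;
end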
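The nested loops above compute, for each weight, the sum over all examples of (label minus output) times the input, which can also be written as a single matrix product. The following is a minimal vectorized sketch of the same batch update. The small `numericdata` matrix is a hypothetical stand-in for the homework data, and the weights start at zero (rather than small random values) purely so the result is reproducible.

```octave
% Hypothetical stand-in for numericdata: bias column, two inputs, 0/1 label.
numericdata = [1 0.2 0.7 0;
               1 0.9 0.4 1;
               1 0.1 0.3 0;
               1 0.8 0.9 1;
               1 0.5 0.6 1;
               1 0.4 0.2 0];
inputs  = numericdata(:, 1:3);
targets = numericdata(:, 4);

weights   = zeros(3, 1);   % assumption: zeros instead of random values, for reproducibility
learnRate = 0.001;

for iter = 1:100
    outputs = 1 ./ (1 + exp(-(inputs * weights)));   % sigmoid output for every example at once
    delta   = inputs' * (targets - outputs);         % same sums as the nested loops
    weights = weights + learnRate * delta;           % batch-mode weight update
end
```

The vectorized form gives the same weights as the loop version up to floating-point rounding, and it is usually much faster in Octave or MATLAB.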
I also talked with my professor while studying this. He said that if the given dataset allows the method to converge, you will see that convergence when you run it for different numbers of iterations.
Upvotes: 0