Reputation: 321
I have implemented Logistic Regression with Gradient Descent in Java. It doesn't seem to work well (It does not classify records properly; the probability of y=1 is a lot.) I don't know whether my implementation is correct.I have gone through the code several times and i am unable to find any bug. I have been following Andrew Ng's tutorials on Machine learning on Course Era. My Java implementation has 3 classes. namely :
This is my cost function:
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
public class Logistic {
/** the learning rate */
private double alpha;
/** the weight to learn */
private double[] theta;
/** the number of iterations */
private int ITERATIONS = 3000;
public Logistic(int n) {
this.alpha = 0.0001;
theta = new double[n];
}
private double sigmoid(double z) {
return (1 / (1 + Math.exp(-z)));
}
public void train(List<Instance> instances) {
double[] temp = new double[3];
//Gradient Descent algorithm for minimizing theta
for(int i=1;i<=ITERATIONS;i++)
{
for(int j=0;j<3;j++)
{
temp[j]=theta[j] - (alpha * sum(j,instances));
}
//simulataneous updates of theta
for(int j=0;j<3;j++)
{
theta[j] = temp[j];
}
System.out.println(Arrays.toString(theta));
}
}
private double sum(int j,List<Instance> instances)
{
double[] x;
double prediction,sum=0,y;
for(int i=0;i<instances.size();i++)
{
x = instances.get(i).getX();
y = instances.get(i).getLabel();
prediction = classify(x);
sum+=((prediction - y) * x[j]);
}
return (sum/instances.size());
}
private double classify(double[] x) {
double logit = .0;
for (int i=0; i<theta.length;i++) {
logit += (theta[i] * x[i]);
}
return sigmoid(logit);
}
public static void main(String... args) throws FileNotFoundException {
//DataSet is a class with a static method readDataSet which reads the dataset
// Instance is a class with two members: double[] x, double label y
// x contains the features and y is the label.
List<Instance> instances = DataSet.readDataSet("data.txt");
// 3 : number of theta parameters corresponding to the features x
// x0 is always 1
Logistic logistic = new Logistic(3);
logistic.train(instances);
//Test data
double[]x = new double[3];
x[0]=1;
x[1]=45;
x[2] = 85;
System.out.println("Prob: "+logistic.classify(x));
}
}
Can anyone tell me what am I doing wrong? Thanks in advance! :)
Upvotes: 3
Views: 3010
Reputation: 14209
As I am studying logistic regression, I took the time to review your code in detail.
TLDR
In fact, it appears the algorithm is correct.
The reason you had so much false negatives or false positives is, I think, because of the hyper parameters you choose.
The model was under-trained so the hypothesis was under-fitting.
Details
I had to create the DataSet
and Instance
classes because you did not publish them, and set up a train data set and a test data set based on the Cryotherapy dataset.
See http://archive.ics.uci.edu/ml/datasets/Cryotherapy+Dataset+.
Then, using your same exact code (for the logistic regression part) and by choosing an alpha rate of 0.001
and a number of iterations of 100000
, I got a precision rate of 80.64516129032258
percent on the test data set, which is not so bad.
I tried to get a better precision rate by tweaking manualy those hyper parameters but could not obtain any better result.
At this point, an enhancement would be to implement regularization, I suppose.
Gradient descent formula
In Andrew Ng's video about the the cost function and gradient descent, it is correct that the 1/m
term is omitted.
A possible explanation is that the 1/m
term is included in the alpha
term.
Or maybe it's just an oversight.
See https://www.youtube.com/watch?v=TTdcc21Ko9A&index=36&list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN&t=6m53s at 6m53s.
But if you watch Andrew Ng's video about regularization and logistic regression you'll notice that the term 1/m
is clearly present in the formula.
See https://www.youtube.com/watch?v=IXPgm1e0IOo&index=42&list=PLLssT5z_DsK-h9vYZkQkYNWcItqhlRJLN&t=2m19s at 2m19s.
Upvotes: 1