Reputation: 862
I am using Weka's LibSVM classifier for classification and need help interpreting the output of the evaluation model.
In the example below, my test.arff file contains 1000 instances, and I want to know the probability with which each instance is classified as yes/no (it is a simple two-class problem).
For example, if instance 1 is classified as 'yes', I want to know the probability assigned to that classification.
Below is the code snippet that I have currently:
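import java.io.File;
import weka.classifiers.Evaluation;
import weka.classifiers.functions.LibSVM;
import weka.core.Instances;
import weka.core.converters.ArffLoader;

// Read and load the Training ARFF file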
ArffLoader trainArffLoader = new ArffLoader();
trainArffLoader.setFile(new File("train_clusters.arff"));
Instances train = trainArffLoader.getDataSet();
train.setClassIndex(train.numAttributes() - 1);
System.out.println("Loaded Train File");
// Read and load the Test ARFF file
ArffLoader testArffLoader = new ArffLoader();
testArffLoader.setFile(new File("test_clusters.arff"));
Instances test = testArffLoader.getDataSet();
test.setClassIndex(test.numAttributes() - 1);
System.out.println("Loaded Test File");
LibSVM libsvm = new LibSVM();
libsvm.buildClassifier(train);
// Evaluation
Evaluation evaluation = new Evaluation(train);
evaluation.evaluateModel(libsvm, test);
System.out.println(evaluation.toSummaryString("\nPrinting the Results\n=====================\n", true));
System.out.println(evaluation.toClassDetailsString());
Upvotes: 0
Views: 479
Reputation: 4749
You should use the libsvm.distributionForInstance method. It returns a probability estimate for each class index (two in your case).
For example, to print all estimates for each instance in the test set, use something like this:
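// Assumes libsvm.setProbabilityEstimates(true) was called before buildClassifier;
// without it the wrapper is expected to return only a 0/1 distribution.
for (Instance instance : test) {
    double[] distribution = libsvm.distributionForInstance(instance);
    // One entry per class value, in the order declared by the class attribute
    for (int classIndex = 0; classIndex < distribution.length; classIndex++) {
        System.out.print(distribution[classIndex] + " ");
    }
    System.out.println();
}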
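Since the question asks for the probability of the class that was actually predicted, one option is to take the largest entry of the distribution array. The following is a minimal, Weka-free sketch of that step. The class name PredictedClassSketch, the helper argMax, and the sample values are all hypothetical; with Weka, the array would come from distributionForInstance and the label could be looked up with test.classAttribute().value(index).

```java
class PredictedClassSketch {
    // Returns the index of the largest estimate, i.e. the predicted class index.
    static int argMax(double[] distribution) {
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical values standing in for one instance's class distribution
        String[] labels = {"yes", "no"};
        double[] distribution = {0.82, 0.18};

        int predicted = argMax(distribution);
        System.out.println(labels[predicted] + " with estimated probability " + distribution[predicted]);
    }
}
```

On ties, this sketch keeps the first class index, which may differ from how a given classifier breaks ties.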
Note that these are not true probabilities but estimates produced by Platt's method (see the question).
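To give a rough idea of what Platt's method does: it maps the raw SVM decision value f through a sigmoid, P(yes | f) = 1 / (1 + exp(A*f + B)), where A and B are fitted on the training data. The sketch below only illustrates that mapping. The class name PlattSketch, the parameter values A = -2.0 and B = 0.0, and the decision values are made up for illustration; in practice the library fits A and B itself.

```java
class PlattSketch {
    // Platt's sigmoid: P(positive | f) = 1 / (1 + exp(a * f + b)),
    // evaluated in a numerically stable way for large |a * f + b|.
    static double plattProbability(double decisionValue, double a, double b) {
        double z = a * decisionValue + b;
        if (z >= 0) {
            return Math.exp(-z) / (1.0 + Math.exp(-z));
        }
        return 1.0 / (1.0 + Math.exp(z));
    }

    public static void main(String[] args) {
        double a = -2.0; // hypothetical fitted slope (negative: larger f means more likely "yes")
        double b = 0.0;  // hypothetical fitted offset
        double[] decisionValues = {-1.5, 0.0, 0.3, 2.0}; // made-up SVM outputs

        for (double f : decisionValues) {
            double pYes = plattProbability(f, a, b);
            System.out.printf("f=%.1f  P(yes)=%.3f  P(no)=%.3f%n", f, pYes, 1.0 - pYes);
        }
    }
}
```

Because the sigmoid is fitted after training, the resulting numbers are calibrated scores rather than probabilities derived from the SVM itself, which is why they should be read as estimates.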
Upvotes: 1