ATP

Reputation: 862

Finding the probability with which an instance is classified in Weka

I am using Weka with the LibSVM classifier and need help interpreting the output I get from evaluating the model.

In the example below, my test ARFF file contains 1000 instances, and I want to know the probability with which each instance is classified as yes or no (it is a simple two-class problem).

For example, if instance 1 is classified as 'yes', I want to know the probability assigned to that classification.

Below is the code snippet that I have currently:

        // Read and load the Training ARFF file
        ArffLoader trainArffLoader = new ArffLoader();
        trainArffLoader.setFile(new File("train_clusters.arff"));
        Instances train = trainArffLoader.getDataSet();
        train.setClassIndex(train.numAttributes() - 1);
        System.out.println("Loaded Train File");

        // Read and load the Test ARFF file 
        ArffLoader testArffLoader = new ArffLoader();
        testArffLoader.setFile(new File("test_clusters.arff"));
        Instances test = testArffLoader.getDataSet();
        test.setClassIndex(test.numAttributes() - 1);
        System.out.println("Loaded Test File");


        LibSVM libsvm = new LibSVM();

        libsvm.buildClassifier(train);

        // Evaluation
        Evaluation evaluation = new Evaluation(train);
        evaluation.evaluateModel(libsvm, test);
        System.out.println(evaluation.toSummaryString("\nPrinting the Results\n=====================\n", true));
        System.out.println(evaluation.toClassDetailsString());

Upvotes: 0

Views: 479

Answers (1)

Nikita Astrakhantsev

Reputation: 4749

You should use the libsvm.distributionForInstance method. It returns a probability estimate for each class index (two in your case).

For example, to print all estimates for each instance in the test set, use something like this:

    for (Instance instance : test) {
        // One estimate per class, indexed by class index
        double[] distribution = libsvm.distributionForInstance(instance);
        for (int classIndex = 0; classIndex < distribution.length; classIndex++) {
            System.out.print(distribution[classIndex] + " ");
        }
        System.out.println();
    }
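
If you only want the predicted class and its estimate for each instance, rather than the whole array, you can take the largest entry of the distribution. The following is a minimal sketch, not part of Weka: `PredictionSummary`, `argMax`, and `describe` are hypothetical names, and the distribution values and labels in `main` are made up for illustration.

```java
import java.util.Locale;

// Hypothetical helper class; it is not part of Weka or LibSVM.
final class PredictionSummary {

    // Returns the index of the largest entry, i.e. the predicted class index.
    static int argMax(double[] distribution) {
        if (distribution == null || distribution.length == 0) {
            throw new IllegalArgumentException("distribution must not be empty");
        }
        int best = 0;
        for (int i = 1; i < distribution.length; i++) {
            if (distribution[i] > distribution[best]) {
                best = i;
            }
        }
        return best;
    }

    // Formats the predicted label together with its estimated probability.
    static String describe(double[] distribution, String[] classLabels) {
        int predicted = argMax(distribution);
        return String.format(Locale.ROOT, "%s (%.3f)",
                classLabels[predicted], distribution[predicted]);
    }

    public static void main(String[] args) {
        // Made-up values standing in for the result of distributionForInstance.
        double[] distribution = {0.27, 0.73};
        String[] classLabels = {"no", "yes"};
        System.out.println(describe(distribution, classLabels)); // prints "yes (0.730)"
    }
}
```

In your Weka code you would pass in the array returned by distributionForInstance, and the label for a class index should be available through test.classAttribute().value(classIndex).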

Note that the values returned by distributionForInstance are not true probabilities, but estimates produced by Platt's method (see the question).
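
To give an idea of what Platt's method does: it maps the raw SVM decision value f through a fitted sigmoid, P(y = 1 | f) = 1 / (1 + exp(A * f + B)), where A and B are fitted on the training data. The sketch below only illustrates that mapping. `PlattSketch` and `plattProbability` are hypothetical names, the values of A and B are invented, and it is not the code LibSVM actually runs, which as far as I know uses a refined variant of this idea.

```java
// Hypothetical illustration of the Platt sigmoid; not LibSVM's implementation.
final class PlattSketch {

    // P(y = 1 | f) = 1 / (1 + exp(a * f + b))
    static double plattProbability(double decisionValue, double a, double b) {
        return 1.0 / (1.0 + Math.exp(a * decisionValue + b));
    }

    public static void main(String[] args) {
        // Invented parameters; in practice they are fitted on training data.
        double a = -2.0;
        double b = 0.0;
        // A negative a means larger decision values give higher estimates.
        for (double f : new double[] {-1.5, 0.0, 1.5}) {
            System.out.println("f = " + f + " -> " + plattProbability(f, a, b));
        }
    }
}
```

Because the estimate depends on a fitted sigmoid rather than on observed frequencies, treat it as a confidence score. Also check that probability estimates are enabled on the classifier before training (I believe the Weka wrapper exposes this as setProbabilityEstimates(true)), otherwise the distribution may just contain 0 and 1.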

Upvotes: 1
