Reputation: 1285
I'm trying to evaluate the performance of a classifier on a test set, and when I try to get the area under the ROC curve, the following error is thrown:
java.lang.NullPointerException
    at weka.classifiers.evaluation.ThresholdCurve.getROCArea(ThresholdCurve.java:268)
    at weka.classifiers.Evaluation.areaUnderROC(Evaluation.java:382)
    at Classifier_Search.runAda(Classifier_Search.java:74)
    at Classifier_Search.acrossTest(Classifier_Search.java:142)
    at Classifier_Search.main(Classifier_Search.java:511)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at edu.rice.cs.drjava.model.compiler.JavacCompiler.runCommand(JavacCompiler.java:271)
The code that is throwing the error is this:
Evaluation eval = new Evaluation(train);
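// Weka matches each flag string exactly, so every flag and its value must be a separate array element
String[] options = {"-P", "100", "-S", "1", "-I", String.valueOf(it), "-W", "weka.classifiers.trees.DecisionStump"};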
AdaBoostM1 cls = new AdaBoostM1();
cls.setOptions(options);
cls.buildClassifier(train);
eval.evaluateModel(cls, test);
int index = test.classIndex();
return eval.areaUnderROC(index);
When I look up the javadoc for Evaluation, it says that areaUnderROC needs to be set by an evaluateClassifier method. No such method exists. Other similar methods (such as falsePositive) work just fine. Has anyone encountered this problem? I can't find anything on OldNabble (Weka's help site).
Thanks!
EDIT: to clarify, test and train are both Instances objects that were created with the following code:
private static Instances readFile(File filename) throws IOException
{
    CSVLoader loader = new CSVLoader();
    loader.setSource(filename);
    Instances data = loader.getDataSet();
    // Use the last attribute (column) as the class
    data.setClassIndex(data.numAttributes() - 1);
    return data;
}
They are read from .csv files. Typically if there is something wrong with test or train, the error is thrown here.
Upvotes: 3
Views: 1310
Reputation: 1285
Once again, I have the answer to my own question. If someone disagrees with this answer, please let me know.
The Weka documentation for areaUnderROC, which already contains one error (it references a method, evaluateClassifier, that does not exist), led me in the wrong direction. I think it also contains a misleading explanation. areaUnderROC works (without throwing the exception) for two values: 0 and 1. So, rather than taking the class index (the index of the attribute I am using as the class in the Instances object), what it actually wants is which of the two classes to consider as positive. Given the variable names in the documentation, I think it's reasonable not to understand this at first glance. Its explanation also parallels that of methods that do take the class index (rather than 0 or 1), which makes it more misleading.
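To make the distinction concrete, here is a minimal, self-contained sketch of what a per-class ROC area measures. It is not Weka's implementation: the class name RocAreaSketch, the method areaUnderRoc, and the toy data are all made up for illustration, and the area is computed with the pairwise (rank-sum) formulation. The point is only that the argument selects one of the class values (0 or 1 for a two-class problem), so an attribute index such as numAttributes() - 1 generally falls outside the valid range.

```java
// Hypothetical illustration only; this is NOT how Weka computes the value.
final class RocAreaSketch {

    /**
     * One-vs-rest ROC area for the class value at index positiveClass.
     * dist[i][k] is the predicted probability that instance i has class value k;
     * actual[i] is the index of the true class value of instance i.
     */
    public static double areaUnderRoc(double[][] dist, int[] actual, int positiveClass) {
        int numClasses = dist[0].length;
        if (positiveClass < 0 || positiveClass >= numClasses) {
            throw new IllegalArgumentException(
                "positiveClass must be in [0, " + (numClasses - 1) + "], got " + positiveClass);
        }
        double wins = 0.0;
        long pairs = 0;
        // Compare every positive instance with every negative instance:
        // the area is the fraction of pairs in which the positive one scores higher.
        for (int i = 0; i < actual.length; i++) {
            if (actual[i] != positiveClass) continue;
            for (int j = 0; j < actual.length; j++) {
                if (actual[j] == positiveClass) continue;
                pairs++;
                double p = dist[i][positiveClass];
                double n = dist[j][positiveClass];
                if (p > n) wins += 1.0;
                else if (p == n) wins += 0.5; // ties count as half
            }
        }
        // Undefined when one of the two groups is empty.
        return pairs == 0 ? Double.NaN : wins / pairs;
    }

    public static void main(String[] args) {
        // Toy predictions for a two-class problem (made-up numbers).
        double[][] dist = { {0.9, 0.1}, {0.6, 0.4}, {0.3, 0.7}, {0.2, 0.8} };
        int[] actual = {0, 1, 0, 1};

        System.out.println(areaUnderRoc(dist, actual, 1)); // prints 0.75
        System.out.println(areaUnderRoc(dist, actual, 0)); // prints 0.75

        // Passing something like an attribute index (here 4, a made-up value) is rejected.
        try {
            areaUnderRoc(dist, actual, 4);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Applied to the code in the question, this would mean passing 0 or 1 to eval.areaUnderROC instead of test.classIndex().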
Upvotes: 3