Reputation: 1
Just tried the MulticlassMetrics feature in Spark MLlib 1.3.1 with a simple generic (label, prediction) input:
(label, prediction)
(1.0, 1.0)
(2.0, 2.0)
(3.0, 3.0)
(4.0, 3.0)
(4.0, 4.0)
(4.0, 4.0)
and I run the following Scala snippet (shown here with the RDD construction included):
import org.apache.spark.mllib.evaluation.MulticlassMetrics

// (label, prediction) pairs from the table above;
// sc is the usual spark-shell SparkContext
val labelsAndPredictions = sc.parallelize(Seq(
  (1.0, 1.0), (2.0, 2.0), (3.0, 3.0),
  (4.0, 3.0), (4.0, 4.0), (4.0, 4.0)))
labelsAndPredictions.foreach(println)
val metrics = new MulticlassMetrics(labelsAndPredictions)
println("confusionMatrix: ")
println(metrics.confusionMatrix)
println("Precision: ")
metrics.labels.foreach(x => println(x.toInt + " " + metrics.precision(x)))
println("Recall: ")
metrics.labels.foreach(x => println(x.toInt + " " + metrics.recall(x)))
The precision results are:
Precision:
1 1.0
2 1.0
3 1.0
4 0.6666666666666666
which seems to be at odds with what one would expect:
1 1.0
2 1.0
3 0.5
4 1.0
Precision: given all the predicted labels for a given class X, how many instances were correctly predicted? (See more at: http://www.text-analytics101.com/2014/10/computing-precision-and-recall-for.html#sthash.OTmBn0Vb.dpuf)
So for class label 4 I would expect
prec(4) = 1.0 (2 out of 2 are correct)
and for class label 3 I would expect
prec(3) = 0.5 (1 out of 2 are correct).
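As a sanity check, here is a minimal plain-Scala sketch (no Spark required; the pairs are hard-coded from the table above) that computes precision by hand, grouping the pairs by predicted class:

// precision(X) = correct predictions of X / all predictions of X
val pairs = Seq((1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 3.0), (4.0, 4.0), (4.0, 4.0))
pairs.groupBy(_._2).toSeq.sortBy(_._1).foreach { case (cls, preds) =>
  val correct = preds.count { case (label, pred) => label == pred }
  println(s"prec(${cls.toInt}) = " + correct.toDouble / preds.size)
}
// prints prec(1) = 1.0, prec(2) = 1.0, prec(3) = 0.5, prec(4) = 1.0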
If I call MLlib's recall() on the same data set, I get exactly the values I would expect for precision.
Could it be that precision() and recall() in MLlib are currently swapped? Any input or comment would be greatly appreciated. Thanks!
Upvotes: 0
Views: 471
Reputation: 11
The problem is that MulticlassMetrics expects predictionAndLabels, an RDD of (prediction, label) pairs. You've got the pairs the other way around, which is why the precision and recall are flipped.
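A minimal sketch of the fix, assuming labelsAndPredictions is the (label, prediction) RDD from the question: swap each pair before constructing the metrics.

// Swap each (label, prediction) pair into the (prediction, label)
// order that MulticlassMetrics expects.
val predictionsAndLabels = labelsAndPredictions.map { case (label, pred) => (pred, label) }
val metrics = new MulticlassMetrics(predictionsAndLabels)
metrics.labels.foreach(x => println(x.toInt + " " + metrics.precision(x)))
// now prints 3 0.5 and 4 1.0, matching the expected precision values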
Upvotes: 0