udo_w_schmitt

Reputation: 1

Does Apache Spark MLlib 1.3.1 correctly compute multi-class precision and recall values?

Just tried the MulticlassMetrics feature in Spark MLlib 1.3.1 with a simple generic (label, prediction) input

(label, prediction)
( 1.0 , 1.0)
( 2.0 , 2.0)
( 3.0 , 3.0)
( 4.0 , 3.0)
( 4.0 , 4.0)
( 4.0 , 4.0)

and ran the following Scala snippet:

    labelsAndPredictions.foreach(println)

    val metrics = new MulticlassMetrics(labelsAndPredictions)
    println("confusionMatrix: ")        
    println(metrics.confusionMatrix)

    println("Precision: ")
    metrics.labels.foreach( x => println(x.toInt + " " + metrics.precision(x.toInt)) )

    println("Recall: ")
    metrics.labels.foreach( x => println(x.toInt + " " + metrics.recall(x.toInt)) )       

This returns the following precision values:
1   1.0
2   1.0
3   1.0
4   0.6666666666666666 

which seems to be at odds with what one would expect:

1   1.0
2   1.0
3   0.5
4   1.0

Precision: given all the predictions of a given class X, how many instances were correctly predicted? (See http://www.text-analytics101.com/2014/10/computing-precision-and-recall-for.html)

So for class label 4 I would expect

prec(4) = 1.0 (2 out of 2 are correct)

and for class label 3 I would expect

prec(3) = 0.5 (1 out of 2 are correct).
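The expected values above can be checked by hand. A minimal, Spark-free sketch (the `precision`/`recall` helpers below are my own, written to match the standard definitions, not MLlib's API):

```scala
// (label, prediction) pairs from the question
val labelsAndPredictions = Seq(
  (1.0, 1.0), (2.0, 2.0), (3.0, 3.0),
  (4.0, 3.0), (4.0, 4.0), (4.0, 4.0)
)

// precision(c) = correct predictions of class c / all predictions of class c
def precision(c: Double): Double = {
  val predicted = labelsAndPredictions.filter(_._2 == c)
  predicted.count(p => p._1 == p._2).toDouble / predicted.size
}

// recall(c) = correct predictions of class c / all true instances of class c
def recall(c: Double): Double = {
  val actual = labelsAndPredictions.filter(_._1 == c)
  actual.count(p => p._1 == p._2).toDouble / actual.size
}

Seq(1.0, 2.0, 3.0, 4.0).foreach { c =>
  println(s"class ${c.toInt}: precision = ${precision(c)}, recall = ${recall(c)}")
}
```

This prints precision 0.5 for class 3 (one of the two predictions of 3 is correct) and precision 1.0 for class 4 (both predictions of 4 are correct), i.e. the "expected" column above, and recall 2/3 for class 4, which is the number MLlib reported as precision.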

If I call MLlib's recall() on the same data set, I get exactly the values I would expect for precision.

Could it be that precision() and recall() in MLlib are currently incorrectly interchanged?

Any input or comments would be greatly appreciated. Thanks!

Upvotes: 0

Views: 471

Answers (1)

abayesed

Reputation: 11

The problem is that MulticlassMetrics expects predictionAndLabels, an RDD of (prediction, label) pairs. You've got it the other way around, which is why the precision and recall are flipped.
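To see why reversing the pair order flips the two metrics, here is a Spark-free sketch. The helper `precisionOf` is hypothetical; it just mirrors the convention that the first element of each pair is the prediction, as `MulticlassMetrics` assumes:

```scala
// Pairs exactly as in the question: (label, prediction) -- the WRONG order
val pairs = Seq(
  (1.0, 1.0), (2.0, 2.0), (3.0, 3.0),
  (4.0, 3.0), (4.0, 4.0), (4.0, 4.0)
)

// The fix: swap each pair so it becomes (prediction, label)
val swapped = pairs.map(_.swap)

// Treats _1 as the prediction and _2 as the label, as MulticlassMetrics does
def precisionOf(data: Seq[(Double, Double)], c: Double): Double = {
  val predictedAsC = data.filter(_._1 == c)
  predictedAsC.count(p => p._1 == p._2).toDouble / predictedAsC.size
}

// Wrong order: labels land in the prediction slot, so "precision" is recall
println(precisionOf(pairs, 4.0))   // 2/3, the value the question saw
// Correct order: the expected precision
println(precisionOf(swapped, 4.0)) // 1.0
```

In the original Spark code, the equivalent fix would be constructing the pairs as (prediction, label) up front, or applying a swap before building the metrics object.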

See http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.evaluation.MulticlassMetrics

Upvotes: 0
