Reputation: 35938
I am getting the error below:

```
found   : org.apache.spark.sql.Dataset[(Double, Double)]
required: org.apache.spark.rdd.RDD[(Double, Double)]
    val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel)
```
on the following code:

```scala
val testScoreAndLabel = testResults.
  select("Label", "ModelProbability").
  map { case Row(l: Double, p: Vector) => (p(1), l) }
val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel)
```
From the error it seems that `testScoreAndLabel` is of type `sql.Dataset`, but `BinaryClassificationMetrics` expects an `RDD`. How can I convert a `sql.Dataset` into an `RDD`?
Upvotes: 2
Views: 2028
Reputation: 35404
I'd do something like this:

```scala
val testScoreAndLabel = testResults.
  select("Label", "ModelProbability").
  map { case Row(l: Double, p: Vector) => (p(1), l) }
```

Now convert `testScoreAndLabel` to an RDD just by calling `testScoreAndLabel.rdd`:

```scala
val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel.rdd)
```
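For context, a minimal end-to-end sketch, assuming a Spark 2.x `SparkSession` named `spark` and that `testResults` is a DataFrame with `Label` and `ModelProbability` columns (the `import spark.implicits._` supplies the `Encoder` the `map` call needs):

```scala
import org.apache.spark.ml.linalg.Vector
import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics
import org.apache.spark.sql.Row

// Needed so that map on a DataFrame/Dataset can find an Encoder
// for the (Double, Double) tuples it produces.
import spark.implicits._

// select(...).map(...) on a DataFrame returns a Dataset[(Double, Double)],
// not an RDD, which is why the compiler complained.
val testScoreAndLabel = testResults.
  select("Label", "ModelProbability").
  map { case Row(l: Double, p: Vector) => (p(1), l) }

// Dataset.rdd bridges back to the RDD API that the older
// spark.mllib BinaryClassificationMetrics constructor requires.
val testMetrics = new BinaryClassificationMetrics(testScoreAndLabel.rdd)
println(s"Area under ROC = ${testMetrics.areaUnderROC()}")
```

This sketch cannot run outside a Spark cluster or shell, so treat it as illustrative only; the key point is that `.rdd` is a zero-argument method available on any `Dataset`.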
Upvotes: 4