Marta Karas

Reputation: 5175

Why has KMeansModel.predict started to throw an error since Spark 1.0.1?

I work with Scala (version 2.10.4) and Spark. I have moved to Spark 1.0.1 and noticed that one of my scripts no longer works correctly. It uses the k-means method from the MLlib library in the following manner.

Assume I have a KMeansModel object named clusters:

scala> clusters.toString
res8: String = org.apache.spark.mllib.clustering.KMeansModel@689eab53

Here is my method in question and an error I receive while trying to compile it:

scala> def clustersSize(normData: RDD[Array[Double]]) = {
 |   normData.map(r => clusters.predict(r))
 | }

<console>:28: error: overloaded method value predict with alternatives:
  (points: org.apache.spark.api.java.JavaRDD[org.apache.spark.mllib.linalg.Vector])org.apache.spark.api.java.JavaRDD[Integer] <and>
  (points: org.apache.spark.rdd.RDD[org.apache.spark.mllib.linalg.Vector])org.apache.spark.rdd.RDD[Int] <and>
  (point: org.apache.spark.mllib.linalg.Vector)Int
 cannot be applied to (Array[Double])
     normData.map(r => clusters.predict(r))

The KMeansModel documentation clearly says that the predict function needs an argument of type Array[Double], and I think that is the type of argument I am passing to it (isn't it?). Thank you in advance for any suggestions on what I am doing wrong.

Upvotes: 0

Views: 936

Answers (1)

Joe Pallas

Reputation: 2155

You're using Spark 1.0.1, but the documentation page you cite is for 0.9.0. Check the current documentation and you'll see that the API has changed. See the migration guide for background.
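For reference: in Spark 1.0, MLlib introduced its own vector type, and KMeansModel.predict now expects an org.apache.spark.mllib.linalg.Vector (exactly the overloads listed in your error). A minimal sketch of the fix, assuming the clusters and normData from your question, is to wrap each Array[Double] with Vectors.dense:

scala> import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.linalg.Vectors

scala> def clustersSize(normData: RDD[Array[Double]]) = {
     |   // predict takes a Vector since 1.0, so convert each array first
     |   normData.map(r => clusters.predict(Vectors.dense(r)))
     | }

Alternatively, the RDD overload does the same in one call: clusters.predict(normData.map(Vectors.dense(_))).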

Upvotes: 2
