Reputation: 6724
Using Spark v1.0-rc3, I get an error when implementing MLlib's linear regression. Eventually I tried copying and pasting Spark's MLlib example code for linear regression in Scala, and I still get the error:
scala> val parsedData = data.map { line =>
val parts = line.split(',')
LabeledPoint(parts(0).toDouble, parts(1).split(' ').map(x => x.toDouble).toArray)
}
<console>:28: error: polymorphic expression cannot be instantiated to expected type;
found : [U >: Double]Array[U]
required: org.apache.spark.mllib.linalg.Vector
LabeledPoint(parts(0).toDouble, parts(1).split(' ').map(x => x.toDouble).toArray)
The error states that org.apache.spark.mllib.linalg.Vector is required, but importing it does not help. Even when I try multiple ways of casting to a Vector, I get:
<console>:19: error: type mismatch;
found : scala.collection.immutable.Vector[Array[Double]]
Upvotes: 1
Views: 2523
Reputation: 6724
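The problem is caused by API changes in the later version: code that worked in v0.9.1 needs tweaking for v1.0, where LabeledPoint takes an org.apache.spark.mllib.linalg.Vector for its features instead of an Array[Double]. You can find the latest docs here. The solution is to import Vectors (the factory object), not Vector, despite what the error tells you. Try: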
import org.apache.spark.mllib.regression.LinearRegressionWithSGD
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.Vectors

// Load and parse the data
val data = sc.textFile("mllib/data/ridge-data/lpsa.data")
val parsedData = data.map { line =>
  val parts = line.split(',')
  // Wrap the feature array in an MLlib Vector, as LabeledPoint requires in v1.0
  LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(x => x.toDouble)))
}
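To see the Vector / Vectors distinction in isolation, here is a minimal sketch that needs no SparkContext, only MLlib 1.0 on the classpath. The helper name parsePoint and the sample line are made up for illustration; they are not part of MLlib. Note that an unqualified Vector in Scala normally refers to scala.collection.immutable.Vector, which is presumably where the second error in the question comes from.

```scala
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.regression.LabeledPoint

// parsePoint is a hypothetical helper, not part of MLlib.
// It assumes lines shaped like "label,f1 f2 f3".
def parsePoint(line: String): LabeledPoint = {
  val parts = line.split(',')
  // Vectors.dense wraps an Array[Double] in an MLlib Vector
  val features: Vector = Vectors.dense(parts(1).trim.split(' ').map(_.toDouble))
  LabeledPoint(parts(0).toDouble, features)
}

// Made-up sample line, just to exercise the helper
val sample = parsePoint("-0.43,1.0 2.0 3.0")

// A sparse vector of size 3 with non-zero entries at indices 0 and 2
val sparse: Vector = Vectors.sparse(3, Array(0, 2), Array(1.0, 3.0))
```

Vector is the type that LabeledPoint expects, while Vectors is the factory object that builds one, so the import you need depends on whether you are naming the type or constructing a value.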
Upvotes: 3