Reputation: 6388
I have a basic Spark MLlib program as follows.
import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf
import org.apache.spark.mllib.linalg.Vectors
object Sample {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("helloApp").setMaster("local")
    val sc = new SparkContext(conf)

    // Load the space-separated input and parse each line into a dense vector
    val data = sc.textFile("data/mllib/kmeans_data.txt")
    val parsedData = data.map(s => Vectors.dense(s.split(' ').map(_.toDouble))).cache()

    // Cluster the data into two classes using KMeans
    val numClusters = 2
    val numIterations = 20
    val clusters = KMeans.train(parsedData, numClusters, numIterations)

    // Export to PMML
    println("PMML Model:\n" + clusters.toPMML)

    sc.stop()
  }
}
I have manually added spark-core, spark-mllib, and spark-sql to the project classpath through IntelliJ, all at version 1.5.0.
I am getting the error below when I run the program. Any idea what's wrong?
Error:scalac: error while loading Vector, Missing dependency 'bad symbolic reference. A signature in Vector.class refers to term types in package org.apache.spark.sql which is not available. It may be completely missing from the current classpath, or the version on the classpath might be incompatible with the version used when compiling Vector.class.', required by /home/fazlann/Downloads/spark-mllib_2.10-1.5.0.jar(org/apache/spark/mllib/linalg/Vector.class
Upvotes: 0
Views: 801
Reputation: 29
@DesirePRG, I have met the same problem as you. The solution is to add a jar which assembles Spark and Hadoop, such as spark-assembly-1.4.1-hadoop2.4.0.jar; then it works properly.
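Alternatively, if you let a build tool resolve the dependencies instead of adding the jars by hand, the transitive dependencies of spark-mllib (including the org.apache.spark.sql classes the compiler is complaining about) come along automatically. A minimal build.sbt sketch, assuming sbt and the versions from the question:

// build.sbt -- minimal sketch; Scala/Spark versions taken from the question
scalaVersion := "2.10.4"

// spark-mllib depends transitively on spark-core and spark-sql,
// so all three end up on the classpath at the same version
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.5.0"

With this in place there is no need to manage the individual jars through IntelliJ; the IDE picks them up from the sbt project, and the versions stay consistent.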
Upvotes: 1