Reputation: 37
I have a word count program in Eclipse using Maven and Scala. After exporting the jar file and trying to run it in a terminal (on Ubuntu), I got an unexpected result.
My Spark path is
home/amel/spark
and my Hadoop path is
/usr/local/hadoop
My commands are: su hadoopusr (then I enter my password), then start-all.sh, then I go to my Spark directory where the jar has been saved and run this command:
spark-submit --class bd.spark_app.first.wordcount --master yarn --master local[2] SparkExample.jar
This is the code I have in Eclipse (I am using a Maven project with the Scala IDE):
package bd.spark_app
import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD.rddToOrderedRDDFunctions

object first {
  def main(args: Array[String]) = {
    val conf = new SparkConf().setMaster("local").setAppName("wordcount")
    val sc = new SparkContext(conf)
    // read the input file, split each line into words, pair each word with 1,
    // and sum the counts per word
    val sampledata = sc.textFile("/home/hadoopusr/sampledata")
    val result = sampledata.flatMap(_.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
    result.collect.foreach(println)
    result.saveAsTextFile("outputfile")
    sc.stop()
  }
}
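(For reference, the same pipeline can be sanity-checked on in-memory data without any input file; the sample strings below are made up for illustration:)
val lines = sc.parallelize(Seq("me you me me", "you food me you", "food cat"))
val counts = lines.flatMap(_.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
counts.collect.foreach(println) // e.g. (me,4), (you,3), (food,2), (cat,1)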
I expected this result:
(me,4)
(you,3)
(food,2)
(cat,1)
Upvotes: 2
Views: 5480
Reputation: 900
spark-submit --class bd.spark_app.first.wordcount --master yarn --master local[2] SparkExample.jar
This command is wrong: it has two --master options, one local and one yarn.
The second thing is that your SparkExample.jar
is NOT in the path where you are trying to execute spark-submit; that is the reason for the ClassNotFoundException.
Please correct all of these. Please refer to https://spark.apache.org/docs/latest/submitting-applications.html
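For example, picking a single master and passing the full path to the jar could look like the line below (the jar path is an assumption based on the Spark directory mentioned in the question; also, since the code defines object first in package bd.spark_app, the class name is likely bd.spark_app.first rather than bd.spark_app.first.wordcount):
spark-submit --class bd.spark_app.first --master local[2] /home/amel/spark/SparkExample.jar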
Upvotes: 1