Reputation: 1265
I am getting an error while executing a Spark job using spark-submit on a Windows 10 machine. The command is:
c:\workspaces\Spark2Demo>spark-submit --class retail_db.GetRevenuePerOrder --master local .\target\scala-2.12\spark2demo_2.12-0.1.jar c:\workspaces\data\retail_db\orders\part-00000 c:\workspaces\output
The error I get is:
2019-03-12 19:09:33 ERROR SparkContext:91 - Error initializing SparkContext.
org.apache.spark.SparkException: Could not parse Master URL: 'c:\workspaces\data\retail_db\orders\part-00000'
at org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2784)
at org.apache.spark.SparkContext.<init>(SparkContext.scala:493)
at retail_db.GetRevenuePerOrder$.main(GetRevenuePerOrder.scala:7)
at retail_db.GetRevenuePerOrder.main(GetRevenuePerOrder.scala)
The file exists and is accessible. I am able to run the program in the IDE. The following is the program:
package retail_db

import org.apache.spark.{SparkConf, SparkContext}

object GetRevenuePerOrder {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster(args(0)).setAppName("GetRevenuePerOrder")
    val sc = new SparkContext(conf)
    sc.setLogLevel("DEBUG")
    println(args)
    val orderItems = sc.textFile(args(1))
    val revenuePerOrder = orderItems
      .map(oi => (oi.split(",")(1).toInt, oi.split(",")(4).toFloat))
      .reduceByKey(_ + _)
      .map(oi => oi._1 + "," + oi._2)
    revenuePerOrder.saveAsTextFile(args(2))
  }
}
Please help.
Upvotes: 1
Views: 3594
Reputation: 1076
You are setting the master twice: first in the spark-submit command (--master local), and a second time in the SparkConf (new SparkConf().setMaster(args(0))). As the Spark configuration page says, "Properties set directly on the SparkConf take highest precedence, then flags passed to spark-submit or spark-shell, then options in the spark-defaults.conf file", so the local master set by spark-submit gets overridden by the SparkConf call. And since your spark-submit command only passes two program arguments (the input path and the output path), args(0) is the input file path, not a master URL, which is exactly what the "Could not parse Master URL" error is complaining about. Remove the setMaster call from the SparkConf.
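A minimal sketch of the corrected program, with your logic unchanged and only the setMaster call removed. Note that the path arguments then shift down to args(0) and args(1), which matches what your spark-submit command already passes:

package retail_db

import org.apache.spark.{SparkConf, SparkContext}

object GetRevenuePerOrder {
  def main(args: Array[String]): Unit = {
    // Let spark-submit's --master flag decide where to run;
    // setting it here on the SparkConf would override that flag.
    val conf = new SparkConf().setAppName("GetRevenuePerOrder")
    val sc = new SparkContext(conf)
    sc.setLogLevel("DEBUG")

    // With the master argument gone, the input and output paths
    // are now args(0) and args(1).
    val orderItems = sc.textFile(args(0))
    val revenuePerOrder = orderItems
      .map(oi => (oi.split(",")(1).toInt, oi.split(",")(4).toFloat))
      .reduceByKey(_ + _)
      .map(oi => oi._1 + "," + oi._2)
    revenuePerOrder.saveAsTextFile(args(1))
  }
}

Your original spark-submit command then works as-is, since it already passes exactly those two paths after the jar.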
Upvotes: 1