Ashish Nijai

Reputation: 331

Submitting application on Spark Cluster using spark submit

I am new to Spark. I want to run a Spark Structured Streaming application on a cluster. The master and the workers have the same configuration.

I have a few questions about submitting an app to the cluster using spark-submit; they may seem trivial or strange:

  1. How can I give the path to 3rd-party jars like lib/*? (The application has 30+ jars.)
  2. Will Spark automatically distribute the application and the required jars to the workers?
  3. Do I need to host the application on all the workers?
  4. How can I know the status of my application, since I am working from the console?

I am using the following script for spark-submit.

  spark-submit \
  --class <class-name> \
  --master spark://master:7077 \
  --deploy-mode cluster \
  --supervise \
  --conf spark.driver.extraClassPath=<jar1,jar2..jarn> \
  --executor-memory 4G \
  --total-executor-cores 8 \
  <running-jar-file>

But the code is not running as expected. Am I missing something?

Upvotes: 0

Views: 1224

Answers (3)

user73478

Reputation: 11

You can build a fat jar containing all the dependencies. The link below explains how.

https://community.hortonworks.com/articles/43886/creating-fat-jars-for-spark-kafka-streaming-using.html
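
As a minimal sketch of that approach (assuming an sbt project with the sbt-assembly plugin; the class name placeholder is taken from the question, and the output jar path is hypothetical):

    # Build a single fat jar that bundles the application and its dependencies
    # (assumes the sbt-assembly plugin is configured in project/plugins.sbt)
    sbt assembly

    # Submit only the assembled jar; no extra classpath options are needed,
    # because all 3rd-party classes are packaged inside it
    spark-submit \
      --class <class-name> \
      --master spark://master:7077 \
      --deploy-mode cluster \
      target/scala-2.11/<your-app>-assembly.jar   # hypothetical output path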

Upvotes: -1

Soheil Pourbafrani

Reputation: 3427

To pass multiple jar files to spark-submit, you can set the following properties in the file SPARK_HOME_PATH/conf/spark-defaults.conf (create it if it does not exist):

Don't forget the * at the end of the paths:

spark.driver.extraClassPath /fullpath/to/jar/folder/*
spark.executor.extraClassPath /fullpathto/jar/folder/*

Spark reads the properties in spark-defaults.conf whenever you use the spark-submit command. Copy your jar files into that directory, and when you submit your Spark app to the cluster, the jar files in the specified paths will be loaded as well.

spark.driver.extraClassPath: Extra classpath entries to prepend to the classpath of the driver. Note: In client mode, this config must not be set through the SparkConf directly in your application, because the driver JVM has already started at that point. Instead, please set this through the --driver-class-path command line option or in your default properties file.
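
A minimal sketch of how this plays out (paths and class name are placeholders from this answer and the question; note that extraClassPath entries are not shipped by Spark, so the jar folder has to exist at the same path on the driver and on every worker, for example via a shared filesystem or a copy on each node):

    # Copy the 3rd-party jars into the folder referenced by
    # spark.driver.extraClassPath / spark.executor.extraClassPath
    cp lib/*.jar /fullpath/to/jar/folder/

    # With the two properties set in spark-defaults.conf, spark-submit picks
    # them up automatically, so the jars do not need to be listed here
    spark-submit \
      --class <class-name> \
      --master spark://master:7077 \
      --deploy-mode cluster \
      <running-jar-file>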

Upvotes: 1

Jungtaek Lim

Reputation: 1708

--jars will transfer your jar files to the worker nodes and make them available on both the driver's and the executors' classpaths.

Please refer to the link below for more details.

http://spark.apache.org/docs/latest/submitting-applications.html#advanced-dependency-management
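
For example, a sketch based on the script in the question: --jars takes a comma-separated list of jars, so the lib/*.jar wildcard is expanded into one here (this assumes the jar file names contain no spaces; all other values are the question's placeholders):

    # Build a comma-separated list from the jars in lib/
    JARS=$(echo lib/*.jar | tr ' ' ',')

    spark-submit \
      --class <class-name> \
      --master spark://master:7077 \
      --deploy-mode cluster \
      --supervise \
      --jars "$JARS" \
      --executor-memory 4G \
      --total-executor-cores 8 \
      <running-jar-file>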

Upvotes: 0
