Reputation: 643
How do you add external dependent jars when submitting a Spark job? I would also like to know how to package dependent jars with the application jar.
Upvotes: 2
Views: 2000
Reputation: 8996
This is a popular question. I looked for a good answer on Stack Overflow but didn't find one that answers it exactly as asked, so I will try to answer it here:
The best way to submit a job is to use the spark-submit script. This assumes that you already have a running cluster (distributed or local, it doesn't matter). You can find the script at $SPARK_HOME/bin/spark-submit.
Here is an example:
spark-submit --name "YourAppNameHere" --class com.path.to.main --master spark://localhost:7077 --driver-memory 1G --conf spark.executor.memory=4g --conf spark.cores.max=100 theUberJar.jar
You give the app a name, point it at your main class, and specify the location of the Spark master (where the cluster runs). You can optionally pass other configuration parameters. The last argument is the uber jar that contains your main class and all of your dependencies.
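As a side note, if you prefer not to bundle your dependencies into a single jar, spark-submit can also ship extra jars alongside your application with the --jars option, which takes a comma-separated list of paths (the paths and jar names below are just placeholders):

spark-submit --name "YourAppNameHere" --class com.path.to.main --master spark://localhost:7077 --jars /path/to/dep1.jar,/path/to/dep2.jar yourApp.jar

This is one way to address the first part of your question without building an uber jar.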
theUberJar.jar relates to your second question, how to package your app. If you are using Scala, the best way is to use sbt and create an uber jar with the sbt-assembly plugin.
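As a rough sketch of what that setup might look like (the plugin and library versions here are only examples, adjust them to your environment), you add the plugin in project/plugins.sbt:

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")

and keep a minimal build.sbt in which Spark itself is marked as "provided", so it is not duplicated inside the uber jar:

name := "YourAppNameHere"
scalaVersion := "2.11.12"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.0" % "provided"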
Here are the steps:
1. Build the uber jar by running sbt assembly from the root of your project
2. Start the cluster if it is not already running (for a local standalone cluster: $SPARK_HOME/sbin/start-all.sh)
3. Submit the job with spark-submit as shown above
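Putting the steps together, a typical session might look like this (same assumptions as in the example above, i.e. a local standalone master on spark://localhost:7077):

sbt assembly
$SPARK_HOME/sbin/start-all.sh
spark-submit --name "YourAppNameHere" --class com.path.to.main --master spark://localhost:7077 theUberJar.jar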
Upvotes: 1