Reputation: 379
I am trying to add my custom jar to a Spark job using the "spark.jars" property. The logs show the jar being added, but when I check the jars that actually end up on the classpath, it is not there. Below are the options I have tried:
1) spark.jars
2) spark.driver.extraLibraryPath
3) spark.executor.extraLibraryPath
4) setJars(Seq[String])
But none of them added the jar. I am using Spark 2.2.0 on HDP, and the jar files are kept locally. Please let me know what I might be doing wrong.
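For reference, this is roughly how I have been setting it (the jar path below is just a placeholder for my actual local jar):

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext

// Placeholder path; the real jar is on the local filesystem of the node
val jarPath = "/path/to/custom.jar"
val conf = new SparkConf()
  .setAppName("MyJob")
  .set("spark.jars", jarPath)   // option 1
  .setJars(Seq(jarPath))        // option 4
val sc = new SparkContext(conf)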
Update: the first option worked for me. spark.jars was adding the jar, as shown in the Spark UI.
Upvotes: 1
Views: 6229
Reputation: 8432
If you need an external jar available to the executors, you can try spark.executor.extraClassPath. According to the documentation it shouldn't be necessary, but it has helped me in the past:
Extra classpath entries to prepend to the classpath of executors. This exists primarily for backwards-compatibility with older versions of Spark. Users typically should not need to set this option.
Documentation: https://spark.apache.org/docs/latest/configuration.html
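A minimal sketch of setting it programmatically (the path is a placeholder, and you can equally pass it with --conf on spark-submit). Note that extraClassPath only prepends an entry to the classpath; it does not ship the jar, so the file has to already exist at that location on every executor node:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Placeholder path; the jar must already be present at this location
// on every executor node, e.g. distributed via --jars or spark.jars
val conf = new SparkConf()
  .set("spark.executor.extraClassPath", "/path/on/each/node/custom.jar")
val spark = SparkSession.builder().config(conf).getOrCreate()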
Upvotes: 1
Reputation: 2232
Check the documentation for submitting jobs; adding extra non-runtime jars is covered at the bottom.
You can either add the jars to spark.jars in the SparkConf or specify them at runtime:
./bin/spark-submit \
--class <main-class> \
--master <master-url> \
--deploy-mode <deploy-mode> \
--conf <key>=<value> \
... # other options
<application-jar> \
[application-arguments]
So try:
spark-submit --master yarn --jars the_jar_i_need.jar my_script.py
For example, I have a PySpark script kafka_consumer.py that requires the jar spark-streaming-kafka-0-8-assembly_2.11-2.1.1.jar.
To run it, the command is:
spark-submit --master yarn --jars spark-streaming-kafka-0-8-assembly_2.11-2.1.1.jar kafka_consumer.py
Upvotes: 0