Ankit Agrahari

Reputation: 379

Spark.jars not adding jars to classpath

I am trying to add my custom jar to a Spark job using the "spark.jars" property. I can see in the logs that the jar is getting added, but when I check the jars that actually end up on the classpath, I don't find it. Below are the options that I have tried:

1) spark.jars

2) spark.driver.extraLibraryPath

3) spark.executor.extraLibraryPath

4) setJars(Seq[String])

But none of them added the jar. I am using Spark 2.2.0 on HDP and the files are kept locally. Please let me know what I might be doing wrong.

Update: the first option worked for me. spark.jars was adding the jar, as shown in the Spark UI.
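For reference, a minimal sketch of setting spark.jars programmatically, assuming the master and deploy mode are supplied by spark-submit (the jar path and app name below are placeholders); SparkContext.listJars() can then be used to confirm the jar was registered:

import org.apache.spark.sql.SparkSession

// Minimal sketch: register a custom jar via spark.jars (placeholder path).
// spark.jars ships the jar and adds it to both driver and executor classpaths.
val spark = SparkSession.builder()
  .appName("custom-jar-example")
  .config("spark.jars", "/path/to/my-custom.jar")
  .getOrCreate()

// Confirm the jar was registered with the context.
spark.sparkContext.listJars().foreach(println)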

Upvotes: 1

Views: 6229

Answers (2)

ssedano

Reputation: 8432

If you need an external jar available to the executors, you can try spark.executor.extraClassPath. According to the documentation it shouldn't be necessary, but it has helped me in the past:

Extra classpath entries to prepend to the classpath of executors. This exists primarily for backwards-compatibility with older versions of Spark. Users typically should not need to set this option.

Documentation: https://spark.apache.org/docs/latest/configuration.html
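As a rough sketch of how that could be set from code, assuming the jar already exists at the same (placeholder) path on every worker node:

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Rough sketch: prepend an existing on-node path to the executor classpath.
// Unlike spark.jars, extraClassPath does not ship the file anywhere.
val conf = new SparkConf()
  .setAppName("extra-classpath-example") // placeholder
  .set("spark.executor.extraClassPath", "/opt/libs/my-custom.jar") // placeholder path

val spark = SparkSession.builder().config(conf).getOrCreate()

Also note that the extraLibraryPath options tried in the question configure the native library path (java.library.path), not the JVM classpath, so they won't pick up a jar.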

Upvotes: 1

Steven Black

Reputation: 2232

Check the documentation for submitting jobs; adding extra non-runtime jars is covered at the bottom.

You can either add the jars to spark.jars in the SparkConf or specify them at runtime:

./bin/spark-submit \
  --class <main-class> \
  --master <master-url> \
  --deploy-mode <deploy-mode> \
  --conf <key>=<value> \
  ... # other options
  <application-jar> \
  [application-arguments]

So try: spark-submit --master yarn --jars the_jar_i_need.jar my_script.py

For example, I have a pyspark script, kafka_consumer.py, that requires the jar spark-streaming-kafka-0-8-assembly_2.11-2.1.1.jar.

To run it, the command is:

spark-submit --master yarn --jars spark-streaming-kafka-0-8-assembly_2.11-2.1.1.jar kafka_consumer.py
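For a Scala job, a sketch of the SparkConf counterpart of --jars mentioned above (the jar name is reused from the example; the app name is a placeholder and the path would need adjusting):

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Rough sketch: setJars populates spark.jars programmatically instead of via --jars.
val conf = new SparkConf()
  .setAppName("kafka-consumer-example") // placeholder
  .setJars(Seq("spark-streaming-kafka-0-8-assembly_2.11-2.1.1.jar"))

val spark = SparkSession.builder().config(conf).getOrCreate()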

Upvotes: 0
