Reputation: 1865
So, I have a PySpark program that runs fine with the following command:
spark-submit --jars terajdbc4.jar,tdgssconfig.jar --master local sparkyness.py
And yes, it's running in local mode and executing only on the master node.
However, I want to be able to launch my PySpark script with just:
python sparkyness.py
So, I have added the following lines of code throughout my PySpark script to facilitate that:
import findspark
findspark.init()
sconf.setMaster("local")
sc._jsc.addJar('/absolute/path/to/tdgssconfig.jar')
sc._jsc.addJar('/absolute/path/to/terajdbc4.jar')
This does not seem to be working, though. Every time I try to run the script with python sparkyness.py, I get the error:
py4j.protocol.Py4JJavaError: An error occurred while calling o48.jdbc.
: java.lang.ClassNotFoundException: com.teradata.jdbc.TeraDriver
What is the difference between spark-submit --jars and sc._jsc.addJar('myjar.jar'), and what could be causing this issue? Do I need to do more than just sc._jsc.addJar()?
Upvotes: 2
Views: 2053
Reputation: 5782
Use spark.jars when building the SparkSession:
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('my_awesome')\
    .config('spark.jars', '/absolute/path/to/jar')\
    .getOrCreate()
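For the two Teradata jars in the question, spark.jars takes a comma-separated list, so a rough sketch could look like the following. The helper names jars_conf and build_session are made up for illustration, and the jar paths are the placeholders from the question. Note that the setting only takes effect if no SparkContext is running yet when getOrCreate() is called.

```python
def jars_conf(*paths):
    """Join jar paths into the comma-separated string that spark.jars expects."""
    return ",".join(paths)


def build_session(jar_paths, app_name="sparkyness"):
    """Create a local SparkSession with the given jars (hypothetical helper)."""
    # Imported here so jars_conf can be used even where PySpark is not installed.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .master("local")
        .appName(app_name)
        .config("spark.jars", jars_conf(*jar_paths))
        .getOrCreate()
    )


# Placeholder paths taken from the question; replace with the real locations.
TERADATA_JARS = [
    "/absolute/path/to/terajdbc4.jar",
    "/absolute/path/to/tdgssconfig.jar",
]

# Usage (requires PySpark and the jars to be present):
# spark = build_session(TERADATA_JARS)
```

With that in place, python sparkyness.py should start the JVM with both jars on the classpath, much as spark-submit --jars does.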
Related: Add Jar to standalone pyspark
Edit: I don't recommend hijacking the _jsc, because I don't think it handles distributing the jars to the driver and executors or adding them to the classpath.
Example: I created a new SparkSession without the Hadoop AWS jar, then tried to access S3, and got this error (the same error as when adding the jar with sc._jsc.addJar):
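If you would rather keep the same flags you pass to spark-submit, another option (as far as I know) is the PYSPARK_SUBMIT_ARGS environment variable, which PySpark reads when it launches the JVM. This is only a sketch: submit_args is a made-up helper, the paths are the placeholders from the question, and the variable has to be set before the first SparkContext or SparkSession is created.

```python
import os


def submit_args(jars, master="local"):
    """Build a PYSPARK_SUBMIT_ARGS value mirroring `spark-submit --jars ... --master ...`."""
    # The trailing "pyspark-shell" token is what PySpark uses as the primary resource.
    return "--jars {} --master {} pyspark-shell".format(",".join(jars), master)


# Placeholder paths from the question; must be set before Spark starts.
os.environ["PYSPARK_SUBMIT_ARGS"] = submit_args([
    "/absolute/path/to/terajdbc4.jar",
    "/absolute/path/to/tdgssconfig.jar",
])
```

After this, creating the SparkContext as usual should pick up the jars at JVM startup, rather than trying to add them afterwards through _jsc.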
Py4JJavaError: An error occurred while calling o35.parquet. : java.io.IOException: No FileSystem for scheme: s3
Then I created a session with the jar and got a new, expected error:
Py4JJavaError: An error occurred while calling o390.parquet. : java.lang.IllegalArgumentException: AWS Access Key ID and Secret Access Key must be specified as the username or password (respectively) of a s3 URL, or by setting the fs.s3.awsAccessKeyId or fs.s3.awsSecretAccessKey properties (respectively).
Upvotes: 1