user5454504

Apache PySpark using Oracle JDBC to pull data: driver cannot be found

I am using Apache Spark's PySpark (spark-1.5.2-bin-hadoop2.6) on Windows 7.

I keep getting this error when I run my Python script in PySpark:

An error occurred while calling o23.load. java.sql.SQLException: No suitable driver found for jdbc:oracle:thin:------------------------------------connection

Here is my Python file:

import os

os.environ["SPARK_HOME"] = "C:\\spark-1.5.2-bin-hadoop2.6"
os.environ["SPARK_CLASSPATH"] = "L:\\Pyspark_Snow\\ojdbc6.jar"

from pyspark import SparkContext, SparkConf
from pyspark.sql import SQLContext

spark_config = SparkConf().setMaster("local[8]")  
sc = SparkContext(conf=spark_config) 
sqlContext = SQLContext(sc)

df = (sqlContext
    .load(source="jdbc",
          url="jdbc:oracle:thin://x.x.x.x/xdb?user=xxxxx&password=xxxx",
          dbtable="x.users")
 )
sc.stop()

Upvotes: 3

Views: 9704

Answers (3)

Nandeesh

Reputation: 2802

To set the jars programmatically, set the spark.yarn.dist.jars configuration to a comma-separated list of jars.

For example:

from pyspark.sql import SparkSession

spark = SparkSession \
        .builder \
        .appName("Spark config example") \
        .config("spark.yarn.dist.jars", "<path-to-jar/test1.jar>,<path-to-jar/test2.jar>") \
        .getOrCreate()

Or via SparkConf:

from pyspark import SparkContext, SparkConf
from pyspark.sql import SQLContext

spark_config = SparkConf().setMaster("local[8]")
spark_config.set("spark.yarn.dist.jars", "L:\\Pyspark_Snow\\ojdbc6.jar")
sc = SparkContext(conf=spark_config) 
sqlContext = SQLContext(sc)

Or pass --jars to spark-submit with the jar file paths separated by commas.
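For example (the jar paths here are hypothetical placeholders):

spark-submit --jars /path/to/ojdbc6.jar,/path/to/other-dependency.jar yourscript.py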

Upvotes: 1

Assaf Mendelson

Reputation: 12991

You can also add the jar using --jars and --driver-class-path, and then set the driver class explicitly. See https://stackoverflow.com/a/36328672/1547734
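A minimal sketch of that combination, using the jar path from the question and assuming a script named yourscript.py:

spark-submit --jars "L:\Pyspark_Snow\ojdbc6.jar" --driver-class-path "L:\Pyspark_Snow\ojdbc6.jar" yourscript.py

Then inside the script, given the sqlContext set up as in the question, the driver class can be named explicitly so Spark does not have to discover it from the jar (the host, port, service, and table values are placeholders; oracle.jdbc.OracleDriver is the driver class shipped in ojdbc6.jar):

# name the driver class explicitly instead of relying on driver discovery
df = (sqlContext.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//<host>:<port>/<service>")  # Oracle thin URL form
      .option("dbtable", "<schema>.<table>")
      .option("driver", "oracle.jdbc.OracleDriver")  # class inside ojdbc6.jar
      .load())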

Upvotes: 0

Nhor

Reputation: 3940

Unfortunately, changing the SPARK_CLASSPATH environment variable won't work. You need to declare

spark.driver.extraClassPath L:\\Pyspark_Snow\\ojdbc6.jar

in your /path/to/spark/conf/spark-defaults.conf, or simply run your spark-submit job with the additional --jars argument:

spark-submit --jars "L:\\Pyspark_Snow\\ojdbc6.jar" yourscript.py

Upvotes: 4
