Reputation: 432
I am trying to link a PostgreSQL database to a Scala/Spark project. I wrote this build.sbt:
name := "Hermes"
version := "1.0"
scalaVersion := "2.10.6"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-sql" % "2.2.0",
"org.apache.spark" %% "spark-core" % "2.0.1",
"org.apache.spark" %% "spark-mllib" % "2.0.1",
"org.postgresql" % "postgresql" % "42.1.1"
)
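Worth noting: sbt package bundles only the project's own classes, not its library dependencies, so the PostgreSQL driver is resolved at compile time but missing from the jar at runtime. A minimal fat-jar sketch using the sbt-assembly plugin (the plugin version here is an assumption; pick one matching your sbt release):

// project/assembly.sbt -- adds the sbt-assembly plugin (version is an assumption)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")

Running sbt assembly then yields a single jar that includes the driver; marking the Spark artifacts as "provided" keeps that jar small, since spark-submit supplies Spark itself.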
And I have this connection method:
def getDatasetFromSql(query: String): Dataset[Row] = {
  val options = Map(
    "driver" -> "org.postgresql.Driver",
    "url" -> createConnection,
    "dbtable" -> query
  )
  val fromSqlDs: Dataset[Row] = spark.read.format("jdbc").options(options).load
  fromSqlDs.cache.printSchema()
  fromSqlDs
}
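A hedged usage note: Spark's JDBC source expects the dbtable option to be either a table name or a parenthesised subquery with an alias, so a call to this method would look like the sketch below (the table and column names are illustrative):

// Hypothetical call: a subquery must be wrapped in parentheses and aliased
val activeUsers = getDatasetFromSql("(SELECT id, name FROM users WHERE active) AS u")
activeUsers.show()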
No exceptions are thrown when I type sbt package, but when I spark-submit my code, this exception is thrown: java.lang.NoClassDefFoundError: org/postgresql/Driver
I have already checked some answers here about using classOf[org.postgresql.Driver] and SparkConf().setJars(). No success so far.
How can I make this work?
Upvotes: 1
Views: 1755
Reputation: 634
I faced a similar issue once. At first I downloaded the Postgres driver and saved it to a particular path, then ran the Spark application as follows:
sbt package
spark-submit --driver-class-path ~/jarDir/postgresql-9.3-1102-jdbc41.jar target/scala-2.10/simple-project_2.10-1.0.jar
Since I was working with Ambari, I then added the Postgres driver directly to a custom parameter, so on later runs there was no need to pass the driver on the command line. Hope it helps.
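One caveat worth adding as a hedged aside: --driver-class-path only puts the jar on the driver's classpath. Since the JDBC reads actually run on the executors, the jar has to reach them too, for example with --jars, or with --packages when the machine can reach Maven Central (the coordinates below simply mirror the build.sbt above):

# ships the local jar to both the driver and the executors
spark-submit --jars ~/jarDir/postgresql-9.3-1102-jdbc41.jar target/scala-2.10/simple-project_2.10-1.0.jar
# or resolve the driver from Maven Central instead
spark-submit --packages org.postgresql:postgresql:42.1.1 target/scala-2.10/simple-project_2.10-1.0.jar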
Upvotes: 3