Reputation: 1307
I have a Scala object that internally queries a MySQL table, does a join, and writes the data to S3. Tested locally, my code runs perfectly fine, but when I submit it to the cluster it throws the error below:
Exception in thread "main" java.sql.SQLException: No suitable driver
    at java.sql.DriverManager.getDriver(DriverManager.java:315)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:54)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$2.apply(JdbcUtils.scala:54)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createConnectionFactory(JdbcUtils.scala:53)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:123)
    at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:117)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:53)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:149)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:122)
    at QuaterlyAudit$.main(QuaterlyAudit.scala:51)
    at QuaterlyAudit.main(QuaterlyAudit.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:736)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:185)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:210)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:124)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
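For context, the read that fails at QuaterlyAudit.scala:51 presumably looks roughly like the sketch below (this is not my actual code; the URL, table name and credentials are placeholders):

// Hypothetical sketch of the JDBC read failing at QuaterlyAudit.scala:51.
val mysqlDF = spark.read
  .format("jdbc")
  .option("url", "jdbc:mysql://<host>:3306/<database>")
  .option("dbtable", "<table>")
  .option("user", "<user>")
  .option("password", "<password>")
  .load()
// Without an explicit .option("driver", "com.mysql.jdbc.Driver"), Spark asks
// java.sql.DriverManager for a driver matching the URL, which is where
// "No suitable driver" is thrown when the connector is not on the driver's classpath.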
Below is my spark-submit command:
nohup spark-submit --class QuaterlyAudit --master yarn-client --num-executors 8
--driver-memory 16g --executor-memory 20g --executor-cores 10 /mypath/campaign.jar &
I am using sbt and included the MySQL connector in the sbt assembly. Below is my build.sbt file:
name := "mobilewalla"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies ++= Seq("org.apache.spark" %% "spark-core" % "2.0.0" % "provided",
"org.apache.spark" %% "spark-sql" % "2.0.0" % "provided",
"org.apache.hadoop" % "hadoop-aws" % "2.6.0" intransitive(),
"mysql" % "mysql-connector-java" % "5.1.37")
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs@_*) =>
xs.map(_.toLowerCase) match {
case ("manifest.mf" :: Nil) |
("index.list" :: Nil) |
("dependencies" :: Nil) |
("license" :: Nil) |
("notice" :: Nil) => MergeStrategy.discard
case _ => MergeStrategy.first // was 'discard' previously
}
case "reference.conf" => MergeStrategy.concat
case _ => MergeStrategy.first
}
assemblyJarName in assembly := "campaign.jar"
I also tried with:
nohup spark-submit --driver-class-path /mypath/mysql-connector-java-5.1.37.jar
--class QuaterlyAudit --master yarn-client --num-executors 8 --driver-memory 16g
--executor-memory 20g --executor-cores 10 /mypath/campaign.jar &
But still no luck. What am I missing here?
Upvotes: 0
Views: 3401
Reputation: 1
In your spark-submit command you need both parameters:
the --jars argument distributes the JAR to every node in the cluster, and --driver-class-path tells the driver application to actually use it.
nohup spark-submit --driver-class-path /mypath/mysql-connector-java-5.1.37.jar --jars /mypath/mysql-connector-java-5.1.37.jar --class QuaterlyAudit --master yarn-client --num-executors 8 --driver-memory 16g --executor-memory 20g --executor-cores 10 /mypath/campaign.jar &
Upvotes: 0
Reputation: 3547
You have to specify packages like this:
spark-submit --packages org.apache.spark:spark-avro_2.11:2.4.4,mysql:mysql-connector-java:5.1.6 your-jar.jar
Upvotes: 0
Reputation: 1483
The obvious reason is that Spark cannot find the JDBC JAR: the JAR is not getting shipped to the driver and the executors. Many people have faced this issue, and there are a few workarounds by which it can be fixed:
1. Pass the JAR(s) on the spark-submit CLI, e.g.:
--jars $(echo ./lib/*.jar | tr ' ' ',')
2. Set spark.driver.extraClassPath and spark.executor.extraClassPath in the SPARK_HOME/conf/spark-defaults.conf file and specify the path of the JAR file as the value of these properties, as sketched below.
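As a sketch, assuming the connector JAR sits at the path mentioned in the question, the two entries in spark-defaults.conf would look like this:

spark.driver.extraClassPath     /mypath/mysql-connector-java-5.1.37.jar
spark.executor.extraClassPath   /mypath/mysql-connector-java-5.1.37.jar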
Ensure that the same path exists on the worker nodes.
Upvotes: 0