I am using Spark 3 with Scala 2.12.3. My application has some dependencies that I want to include in a fat jar. One option I found is to build it with sbt-assembly. To do this, I created a project/assembly.sbt file:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")
and my build.sbt file has:
name := "explore-spark"
version := "0.2"
scalaVersion := "2.12.3"
val sparkVersion = "3.0.0"
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
  "org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
  "com.twitter" %% "algebird-core" % "0.13.7",
  "joda-time" % "joda-time" % "2.5",
  "org.fusesource.mqtt-client" % "mqtt-client" % "1.16"
)
mainClass in(Compile, packageBin) := Some("")
mainClass in assembly := Some("")
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)
assemblyJarName in assembly := s"${name.value}_${scalaBinaryVersion.value}-fat_${version.value}.jar"
Then I execute the command sbt assembly in the root directory of the project. I get warning messages saying that files are being discarded.
[info] Merging files...
[warn] Merging 'META-INF/NOTICE.txt' with strategy 'rename'
[warn] Merging 'META-INF/LICENSE.txt' with strategy 'rename'
[warn] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
[warn] Merging 'META-INF/maven/com.googlecode.javaewah/JavaEWAH/' with strategy 'discard'
[warn] Merging 'META-INF/maven/com.googlecode.javaewah/JavaEWAH/pom.xml' with strategy 'discard'
[warn] Merging 'META-INF/maven/joda-time/joda-time/' with strategy 'discard'
[warn] Merging 'META-INF/maven/joda-time/joda-time/pom.xml' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtbuf/hawtbuf/' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtbuf/hawtbuf/pom.xml' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtdispatch/hawtdispatch-transport/' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtdispatch/hawtdispatch-transport/pom.xml' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtdispatch/hawtdispatch/' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtdispatch/hawtdispatch/pom.xml' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.mqtt-client/mqtt-client/' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.mqtt-client/mqtt-client/pom.xml' with strategy 'discard'
[warn] Strategy 'discard' was applied to 13 files
[warn] Strategy 'rename' was applied to 2 files
[info] SHA-1: 2f2a311b8c826caae5f65a3670a71aafa12e2dc7
[info] Packaging /home/felipe/workspace-idea/explore-spark/target/scala-2.12/explore-spark_2.12-fat_0.2.jar ...
[info] Done packaging.
[success] Total time: 13 s, completed Jul 20, 2020 12:44:37 PM
Then, when I try to submit my Spark application, I get the error java.lang.NoClassDefFoundError: org/fusesource/hawtbuf/Buffer. I created the fat jar, but somehow it seems to be discarding the dependencies that I need. This is how I submit the application, to show that I am using the fat jar:
$ ./bin/spark-submit --master spark:// --deploy-mode cluster --driver-cores 4 --name "App" --conf "spark.driver.extraJavaOptions=-javaagent:/home/flink/spark-3.0.0-bin-hadoop2.7/jars/jmx_prometheus_javaagent-0.13.0.jar=8082:/home/flink/spark-3.0.0-bin-hadoop2.7/conf/spark.yml" /home/felipe/workspace-idea/explore-spark/target/scala-2.12/explore-spark_2.12-fat_0.2.jar -app 2
You can debug in the following order:
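A reasonable first check is whether the missing class actually made it into the fat jar. The warnings in the log above only mention files under META-INF (manifests, pom.xml files, license notices), so they do not by themselves show that any class files were dropped. The following is a minimal sketch of such a check, using only java.util.jar from the JDK. The object name JarCheck and the helper entriesWithPrefix are made up for illustration and are not part of sbt-assembly or Spark.

```scala
import java.util.jar.JarFile
import scala.collection.JavaConverters._

// Hypothetical helper for illustration: lists jar entries under a path prefix.
object JarCheck {

  /** Returns the names of all entries in the jar that start with `prefix`. */
  def entriesWithPrefix(jarPath: String, prefix: String): List[String] = {
    val jar = new JarFile(jarPath)
    try {
      // Materialize the list before the jar is closed in `finally`.
      jar.entries().asScala.map(_.getName).filter(_.startsWith(prefix)).toList
    } finally {
      jar.close()
    }
  }

  def main(args: Array[String]): Unit = args match {
    case Array(jarPath, prefix) =>
      val hits = entriesWithPrefix(jarPath, prefix)
      if (hits.isEmpty) println(s"No entries under '$prefix' in $jarPath")
      else hits.foreach(println)
    case _ =>
      println("usage: JarCheck <path-to-jar> <entry-prefix>")
  }
}
```

For the jar in the question, the arguments would be the path target/scala-2.12/explore-spark_2.12-fat_0.2.jar and the prefix org/fusesource/hawtbuf/. If org/fusesource/hawtbuf/Buffer.class is listed, the packaging is probably fine and the next thing to look at is which jar the cluster actually loads at runtime. If nothing is listed, the dependency never reached the assembly and the build definition is the place to look.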
