Bart

Reputation: 101

spark-submit java.lang.ClassNotFoundException

I'm trying to run my own Spark application, but when I use the spark-submit command I get this error:

Users/_name_here/dev/sp/target/scala-2.10/sp_2.10-0.1-SNAPSHOT.jar --stacktrace
java.lang.ClassNotFoundException:        /Users/_name_here/dev/sp/mo/src/main/scala/MySimpleApp
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:340)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:633)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I'm using the following command:

/Users/_name_here/dev/spark/bin/spark-submit \
--class "/Users/_name_here/dev/sp/mo/src/main/scala/MySimpleApp" \
--master local[4] \
/Users/_name_here/dev/sp/target/scala-2.10/sp_2.10-0.1-SNAPSHOT.jar

My build.sbt looks like this:

name := "mo"

version := "1.0"

scalaVersion := "2.10.4"


libraryDependencies ++= Seq(
  "org.apache.spark"          % "spark-core_2.10"   %    "1.4.0",
  "org.postgresql"            % "postgresql"        %    "9.4-1201-jdbc41",
  "org.apache.spark"          % "spark-sql_2.10"    %    "1.4.0",
  "org.apache.spark"          % "spark-mllib_2.10"  %    "1.4.0",
  "org.tachyonproject"        % "tachyon-client"    %    "0.6.4",
  "org.postgresql"            % "postgresql"        %    "9.4-1201-jdbc41",
  "org.apache.spark"          % "spark-hive_2.10"   %    "1.4.0",
  "com.typesafe"              % "config"            %    "1.2.1"
)

resolvers += "Typesafe Repo" at "http://repo.typesafe.com/typesafe/releases/"

My plugin.sbt:

logLevel := Level.Warn

resolvers += "Sonatype snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/"

addSbtPlugin("com.github.mpeltonen" % "sbt-idea" % "1.6.0")
addSbtPlugin("com.eed3si9n" % "sbt-assembly"  %"0.11.2")

I'm using the prebuilt package from spark.apache.org. I installed sbt and Scala through brew. Running sbt package from the spark root folder works fine and creates the jar, but using assembly doesn't work at all, maybe because it's missing in the prebuilt Spark folder. I would appreciate any help because I'm quite new to Spark. Oh, and by the way: the app runs fine within IntelliJ.

Upvotes: 10

Views: 17961

Answers (3)

SharpLu

Reputation: 1214

The problem is that you are passing an incorrect --class argument.

  1. Make sure you pass the fully qualified class name, not a file path, even in a Java/Maven project. Instead of --class "/Users/_name_here/dev/sp/mo/src/main/scala/MySimpleApp", it should look like --class com.example.MyClass. If the class is in the default package, it can simply be --class MyClass. A corrected command for your case is sketched after the examples below.

  2. Here are several examples of spark-submit usage:

    Run application locally on 8 cores:

    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master local[8] \
      /path/to/examples.jar \
      100

    Run on a Spark standalone cluster in client deploy mode:

    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master spark://207.184.161.138:7077 \
      --executor-memory 20G \
      --total-executor-cores 100 \
      /path/to/examples.jar \
      1000

    Run on a Spark standalone cluster in cluster deploy mode with supervise:

    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master spark://207.184.161.138:7077 \
      --deploy-mode cluster \
      --supervise \
      --executor-memory 20G \
      --total-executor-cores 100 \
      /path/to/examples.jar \
      1000

    Run on a YARN cluster (--deploy-mode can be client for client mode):

    export HADOOP_CONF_DIR=XXX
    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master yarn \
      --deploy-mode cluster \
      --executor-memory 20G \
      --num-executors 50 \
      /path/to/examples.jar \
      1000

    Run a Python application on a Spark standalone cluster:

    ./bin/spark-submit \
      --master spark://207.184.161.138:7077 \
      examples/src/main/python/pi.py \
      1000

    Run on a Mesos cluster in cluster deploy mode with supervise:

    ./bin/spark-submit \
      --class org.apache.spark.examples.SparkPi \
      --master mesos://207.184.161.138:7077 \
      --deploy-mode cluster \
      --supervise \
      --executor-memory 20G \
      --total-executor-cores 100 \
      http://path/to/examples.jar \
      1000
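
Applied to the command in the question, and assuming MySimpleApp is not declared inside a package (the source isn't shown, so this is a sketch), the fixed invocation would be:

    /Users/_name_here/dev/spark/bin/spark-submit \
      --class MySimpleApp \
      --master local[4] \
      /Users/_name_here/dev/sp/target/scala-2.10/sp_2.10-0.1-SNAPSHOT.jar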

Upvotes: 0

Bart

Reputation: 101

Apparently there was something wrong with my project structure in general, because I created a new project with sbt and Sublime and I'm now able to use spark-submit. This is really weird, though, because I haven't changed anything about the default structure of an sbt project that IntelliJ provides. This is the project structure which now works like a charm:

Macbook:sp user$ find .
.
./build.sbt
./project
./project/plugin.sbt
./src
./src/main
./src/main/scala
./src/main/scala/MySimpleApp.scala
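
For reference, MySimpleApp.scala is just a bare-bones Spark app along the following lines (a sketch rather than the exact file; the important part is that it declares no package, so --class MySimpleApp resolves):

import org.apache.spark.{SparkConf, SparkContext}

// No package declaration: spark-submit finds the class as plain "MySimpleApp"
object MySimpleApp {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MySimpleApp")
    val sc = new SparkContext(conf)
    // Trivial job just to verify that the submit pipeline works
    val evens = sc.parallelize(1 to 100).filter(_ % 2 == 0).count()
    println(s"Even numbers counted: $evens")
    sc.stop()
  }
}

After running sbt package from the project root, it launches with spark-submit --class MySimpleApp --master local[4] followed by the jar under target/scala-2.10/.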

Thanks for your help!

Upvotes: 0

TheMP

Reputation: 8427

You should not refer to your class by its directory path, but by its package path. Example:

/Users/_name_here/dev/spark/bin/spark-submit \
--master local[4] \
--class com.example.MySimpleApp \
/Users/_name_here/dev/sp/target/scala-2.10/sp_2.10-0.1-SNAPSHOT.jar

From what I can see, you do not have MySimpleApp in any package, so just --class MySimpleApp should work.
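
For instance, with a (hypothetical) package declaration at the top of the source file, the --class argument must match the fully qualified name:

// src/main/scala/com/example/MySimpleApp.scala
package com.example

import org.apache.spark.{SparkConf, SparkContext}

object MySimpleApp {
  def main(args: Array[String]): Unit = {
    // Minimal job so the submit can be verified end to end
    val sc = new SparkContext(new SparkConf().setAppName("MySimpleApp"))
    println(sc.parallelize(1 to 10).count())
    sc.stop()
  }
}

This version would be submitted with --class com.example.MySimpleApp; without the package line, --class MySimpleApp is correct.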

Upvotes: 8
