
Reputation: 1048

spark test on local machine

I am running unit tests against Spark 1.3.1 with sbt test. Besides the tests being incredibly slow, I keep running into java.lang.ClassNotFoundException: org.apache.spark.storage.RDDBlockId. Usually this points to a dependency problem, but I can't tell where it comes from. I tried installing everything on a new machine, including a fresh Hadoop and a fresh ivy2, but I still run into the same issue.

Any help is greatly appreciated.


Exception in thread "Driver Heartbeater" java.lang.ClassNotFoundException: 
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:270)

My build.sbt:

libraryDependencies ++=  Seq( 
  "org.scalaz"              %% "scalaz-core" % "7.1.2" excludeAll ExclusionRule(organization = "org.slf4j"), 
  "com.typesafe.play"       %% "play-json" % "2.3.4" excludeAll ExclusionRule(organization = "org.slf4j"), 
  "org.apache.spark"        %% "spark-core" % "1.3.1" % "provided"  withSources() excludeAll (ExclusionRule(organization = "org.slf4j"), ExclusionRule("org.spark-project.akka", "akka-actor_2.10")), 
  "org.apache.spark"        %% "spark-graphx" % "1.3.1" % "provided" withSources() excludeAll (ExclusionRule(organization = "org.slf4j"), ExclusionRule("org.spark-project.akka", "akka-actor_2.10")), 
  "org.apache.cassandra"    % "cassandra-all" % "2.1.6", 
  "org.apache.cassandra"    % "cassandra-thrift" % "2.1.6", 
  "com.typesafe.akka" %% "akka-actor" % "2.3.11", 
  "com.datastax.cassandra"  % "cassandra-driver-core" % "2.1.6" withSources() withJavadoc() excludeAll (ExclusionRule(organization = "org.slf4j"),ExclusionRule(organization = "org.apache.spark"),ExclusionRule(organization = "com.twitter",name = "parquet-hadoop-bundle")), 
  "com.github.nscala-time"  %% "nscala-time" % "1.2.0" excludeAll ExclusionRule(organization = "org.slf4j") withSources(), 
  "com.datastax.spark"      %% "spark-cassandra-connector-embedded" % "1.3.0-M2" excludeAll (ExclusionRule(organization = "org.slf4j"),ExclusionRule(organization = "org.apache.spark"),ExclusionRule(organization = "com.twitter",name = "parquet-hadoop-bundle")), 
  "com.datastax.spark"      %% "spark-cassandra-connector" % "1.3.0-M2" excludeAll (ExclusionRule(organization = "org.slf4j"),ExclusionRule(organization = "org.apache.spark"),ExclusionRule(organization = "com.twitter",name = "parquet-hadoop-bundle")), 
  "org.slf4j"               % "slf4j-api"            % "1.6.1", 
  "com.twitter"             % "jsr166e" % "1.1.0", 
  "org.slf4j"               % "slf4j-nop" % "1.6.1" % "test", 
  "org.scalatest"           %% "scalatest" % "2.2.1" % "test" excludeAll ExclusionRule(organization = "org.slf4j") 
)

And my Spark test settings (I have disabled all of them while testing this):

(spark.app.name,Count all entries 217885402) 

An assembled or packaged jar submitted to a standalone or Mesos cluster works fine! Suggestions?

Upvotes: 3

Views: 1742

Answers (2)


Reputation: 199

We ran into the same issue in Spark 1.6.0 (there is already a bug report for it). We fixed it by switching to the Kryo serializer, which you should be using anyway. So it appears to be a bug in the default JavaSerializer.

Simply do the following to get rid of it:

new SparkConf().setAppName("Simple Application").set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
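For a test run through sbt test there is no spark-submit to supply a master, so the configuration above would likely need one set explicitly. A minimal sketch of how that might look, assuming a local[2] master and a hypothetical object name KryoLocalSketch (neither is from the question):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical name, for illustration only.
object KryoLocalSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[2]") // assumption: run in-process with two threads
      .setAppName("Simple Application")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")

    val sc = new SparkContext(conf)
    try {
      // Trivial job to confirm the context works with Kryo enabled.
      val total = sc.parallelize(1 to 100).reduce(_ + _)
      assert(total == 5050)
    } finally {
      sc.stop() // always release the context so later tests can create their own
    }
  }
}
```

Stopping the context in a finally block matters in test suites, since leftover contexts can interfere with the next test.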

Upvotes: 1


Reputation: 1048

The cause was a large broadcast variable. I am unsure why (it fit in memory), but removing it from the test cases made them work.
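If the tests still need to exercise the broadcast code path, one option might be to broadcast a small fixture instead of the full data set. This is only a sketch under that assumption: the lookup map, the local[2] master, and the object name SmallBroadcastSketch are all hypothetical and not from my actual test suite.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical name, for illustration only.
object SmallBroadcastSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setMaster("local[2]") // assumption: in-process master for tests
      .setAppName("small-broadcast-test")

    val sc = new SparkContext(conf)
    try {
      // Hypothetical fixture: a handful of entries standing in for the large table.
      val lookup = sc.broadcast(Map(1 -> "a", 2 -> "b"))

      val result = sc.parallelize(Seq(1, 2, 3))
        .map(id => lookup.value.getOrElse(id, "unknown"))
        .collect()

      assert(result.sameElements(Array("a", "b", "unknown")))
    } finally {
      sc.stop()
    }
  }
}
```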

Upvotes: 0
