Reputation: 11627
I've written unit tests based on the DataframeGenerator example, which lets you generate mock DataFrames on the fly
After executing the following commands successfully
sbt clean
sbt update
sbt compile
I get the errors shown in the output below upon running either of the following commands
sbt assembly
sbt test -- -oF
Output
...
[info] SearchClicksProcessorTest:
17/11/24 14:19:04 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/11/24 14:19:07 WARN SparkContext: Using an existing SparkContext; some configuration may not take effect.
17/11/24 14:19:18 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
17/11/24 14:19:18 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
17/11/24 14:19:19 WARN ObjectStore: Failed to get database global_temp, returning NoSuchObjectException
[info] - testExplodeMap *** FAILED ***
[info] ExceptionInInitializerError was thrown during property evaluation.
[info] Message: "None"
[info] Occurred when passed generated values (
[info]
[info] )
[info] - testFilterByClicks *** FAILED ***
[info] NoClassDefFoundError was thrown during property evaluation.
[info] Message: Could not initialize class org.apache.spark.rdd.RDDOperationScope$
[info] Occurred when passed generated values (
[info]
[info] )
[info] - testGetClicksData *** FAILED ***
[info] NoClassDefFoundError was thrown during property evaluation.
[info] Message: Could not initialize class org.apache.spark.rdd.RDDOperationScope$
[info] Occurred when passed generated values (
[info]
[info] )
...
[info] *** 3 TESTS FAILED ***
[error] Failed: Total 6, Failed 3, Errors 0, Passed 3
[error] Failed tests:
[error] com.company.spark.ml.pipelines.search.SearchClicksProcessorTest
[error] (root/test:test) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 73 s, completed 24 Nov, 2017 2:19:28 PM
Things that I've tried unsuccessfully:
My questions are:
EDIT-1: My unit-test class contains several methods like the one below
import com.holdenkarau.spark.testing.DataframeGenerator
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.{IntegerType, StructField, StructType}
import org.scalacheck.Prop
import org.scalatest.FunSuite
import org.scalatest.prop.Checkers

class SearchClicksProcessorTest extends FunSuite with Checkers {
  import spark.implicits._  // assumes a SparkSession named `spark` is in scope (see note below)

  test("testGetClicksData") {
    // Input schema for the generated DataFrame
    val schemaIn = StructType(List(
      StructField("rank", IntegerType),
      StructField("city_id", IntegerType),
      StructField("target", IntegerType)
    ))
    // Expected output schema after the transformation
    val schemaOut = StructType(List(
      StructField("clicked_res_rank", IntegerType),
      StructField("city_id", IntegerType)
    ))
    val dataFrameGen = DataframeGenerator.arbitraryDataFrame(spark.sqlContext, schemaIn)
    val property = Prop.forAll(dataFrameGen.arbitrary) { dfIn: DataFrame =>
      dfIn.cache()
      val dfOut: DataFrame = dfIn.transform(SearchClicksProcessor.getClicksData)
      dfIn.schema === schemaIn &&
        dfOut.schema === schemaOut &&
        dfIn.filter($"target" === 1).count() === dfOut.count()
    }
    check(property)
  }
}
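Note that the test assumes a SparkSession named spark is already in scope. A minimal sketch of providing one, where the master and app name are assumptions of mine:

import org.apache.spark.sql.SparkSession

// Lazily construct a local SparkSession for the test suite
lazy val spark: SparkSession = SparkSession.builder()
  .master("local[*]")
  .appName("SearchClicksProcessorTest")
  .getOrCreate()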
while build.sbt looks like this
// core settings
organization := "com.company"
scalaVersion := "2.11.11"
name := "repo-name"
version := "0.0.1"

// cache options
offline := false
updateOptions := updateOptions.value.withCachedResolution(true)

// aggregate options
aggregate in assembly := false
aggregate in update := false

// fork options
fork in Test := true

// common libraryDependencies
libraryDependencies ++= Seq(
  scalaTest,
  typesafeConfig,
  ...
  scalajHttp
)
libraryDependencies ++= allAwsDependencies
libraryDependencies ++= SparkDependencies.allSparkDependencies

assemblyMergeStrategy in assembly := {
  case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
  ...
  case _ => MergeStrategy.first
}

// hyphens are not legal in Scala identifiers, so the module vals are
// written without them here
lazy val module1 = project in file("directory-1")
lazy val module2 = (project in file("directory-2")).
  dependsOn(module1).
  aggregate(module1)
lazy val root = (project in file(".")).
  dependsOn(module2).
  aggregate(module2)
Upvotes: 2
Views: 2762
Reputation: 6139
I have had a similar problem, and after investigating I found that adding lazy before a val solved my issue. My guess is that running a Scala program under ScalaTest invokes a slightly different initialization sequence: whereas a normal Scala execution initializes vals top-down in source-line order (with nested object {...} blocks initialized the same way), the same code under ScalaTest initializes the vals in nested object {...} blocks before the vals that appear above the object {...} block.
This is admittedly vague, I know, but deferring initialization by prefixing vals with lazy could solve the test issue here.
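As an illustration, a contrived sketch of the failure mode described above (all names are hypothetical):

object Pipeline {
  val basePath: String = "hdfs://namenode"

  object Paths {
    // If this block is initialized before basePath (as described above),
    // basePath is still null here, and dependent code can blow up with
    // ExceptionInInitializerError / NoClassDefFoundError
    val dataPath: String = basePath + "/data"
  }

  // Workaround: defer evaluation until first access
  lazy val safeDataPath: String = basePath + "/data"
}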
The crucial thing here is that it doesn't occur in normal execution, only test execution, and in my case it was only occurring when using lambdas with taps in this form:
...
.tap(x =>
  hook_feld_erweiterungen_hook(
    abc = theProblematicVal
  )
)
...
Upvotes: 1
Reputation: 11627
P.S. Please read the comments on the original question before reading this answer
Even the popular solution of overriding SBT's transitive dependency on fasterxml.jackson didn't work for me; some more changes were required (the ExceptionInInitializerError was gone, but another error cropped up)
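For reference, that popular override looks roughly like the following in build.sbt; the group IDs are real, but the pinned version below is an assumption and must match what your Spark distribution was built against:

// sbt 0.13 syntax: pin the Jackson artifacts Spark pulls in transitively
// (version 2.6.7 is an assumed example, not a recommendation)
dependencyOverrides ++= Set(
  "com.fasterxml.jackson.core" % "jackson-core" % "2.6.7",
  "com.fasterxml.jackson.core" % "jackson-databind" % "2.6.7",
  "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.6.7"
)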
Finally (in addition to the above-mentioned fix) I ended up creating DataFrames in a different way (as opposed to the StructType approach used here). I created them as
spark.sparkContext.parallelize(Seq(MyType(...))).toDF()
where MyType is a case class matching the schema of the DataFrame
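For instance, a minimal sketch of that approach (MyType and its sample values are hypothetical):

import spark.implicits._  // required for .toDF() on an RDD of a case class

case class MyType(clicked_res_rank: Int, city_id: Int)

val df = spark.sparkContext
  .parallelize(Seq(MyType(1, 42), MyType(3, 7)))
  .toDF()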
While implementing this solution, I encountered a small problem: while the datatypes of the schema generated from the case class were correct, the nullability of the fields often mismatched; the fix for this issue was found here
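One common way to work around such a mismatch (an assumption on my part, not necessarily the exact fix referenced above) is to rebuild the DataFrame against an explicitly adjusted schema:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.StructType

// Copy every field of the schema with the desired nullability, then
// re-create the DataFrame from the underlying RDD[Row]
def setNullable(df: DataFrame, nullable: Boolean): DataFrame = {
  val adjusted = StructType(df.schema.map(_.copy(nullable = nullable)))
  df.sparkSession.createDataFrame(df.rdd, adjusted)
}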
I'll openly admit that I'm not sure which was the actual fix: the fasterxml.jackson dependency override or the alternate way of creating DataFrames, so please feel free to fill in the gaps in understanding / investigating the issue
Upvotes: 0