To simplify unit testing with Spark and Scala, I am using ScalaTest and mockito-scala (with MockitoSugar). That lets you simply do something like this:
val sparkSessionMock = mock[SparkSession]
Then you can usually do all the magic with "when" and "verify".
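For instance, a minimal stub-and-verify sketch against that mock (readerMock is just an illustrative name; this assumes the surrounding suite mixes in MockitoSugar and imports org.mockito.Mockito.{ when, verify }):

val readerMock = mock[DataFrameReader]
when(sparkSessionMock.read).thenReturn(readerMock) // stub the call

sparkSessionMock.read         // exercise it
verify(sparkSessionMock).read // assert the interaction happened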
But if you have an implementation that has the necessary import of
import spark.implicits._
in its code, then the simplicity of unit testing seems to be gone (or at least I haven't found a proper way to solve this yet). I end up getting this error:
org.mockito.exceptions.verification.SmartNullPointerException:
You have a NullPointerException here:
-> at ...
because this method call was *not* stubbed correctly:
-> at scala.Option.orElse(Option.scala:289)
sparkSession.implicits();
Simply mocking the "implicits" member of SparkSession won't help, due to typing issues:
val implicitsMock = mock[SQLImplicits]
when(sparkSessionMock.implicits).thenReturn(implicitsMock)
will not compile, since the required type is the path-dependent singleton type of the implicits object inside your mock:
require: sparkSessionMock.implicits.type
found: implicitsMock.type
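That error stems from path-dependent typing: "implicits" is an object, so its type is tied to the specific SparkSession instance it lives in. A simplified, self-contained sketch of the same problem (Outer/inner are hypothetical names):

class Outer { object inner }

val a = new Outer
val b = new Outer

val ok: a.inner.type = a.inner    // compiles: same path
// val ko: a.inner.type = b.inner // does not compile: b.inner has type b.inner.type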
And please don't tell me that I should rather do SparkSession.builder.getOrCreate()... since then it isn't a unit test anymore but a heavyweight integration test.
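For reference, that heavier setup I want to avoid would look roughly like this (a real local session, no mocks; the app name is arbitrary):

implicit val spark: SparkSession = SparkSession.builder()
  .master("local[*]")
  .appName("integration-test")
  .getOrCreate()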
(Edit): here is a complete reproducible example:
import org.apache.spark.sql._
import org.mockito.Mockito.when
import org.scalatest.{ FlatSpec, Matchers }
import org.scalatestplus.mockito.MockitoSugar

case class MyData(key: String, value: String)

class ClassToTest()(implicit spark: SparkSession) {
  import spark.implicits._

  def read(path: String): Dataset[MyData] =
    spark.read.parquet(path).as[MyData]
}

class SparkMock extends FlatSpec with Matchers with MockitoSugar {

  it should "be able to mock spark.implicits" in {
    implicit val sparkMock: SparkSession = mock[SparkSession]
    val implicitsMock = mock[SQLImplicits]
    when(sparkMock.implicits).thenReturn(implicitsMock)

    val readerMock = mock[DataFrameReader]
    when(sparkMock.read).thenReturn(readerMock)

    val dataFrameMock = mock[DataFrame]
    when(readerMock.parquet("/some/path")).thenReturn(dataFrameMock)

    val dataSetMock = mock[Dataset[MyData]]
    implicit val testEncoder: Encoder[MyData] = Encoders.product[MyData]
    when(dataFrameMock.as[MyData]).thenReturn(dataSetMock)

    new ClassToTest().read("/some/path/") shouldBe dataSetMock
  }
}
You can't mock implicits. Implicits are resolved at compile time, while mocking happens at runtime (via runtime reflection and bytecode manipulation with Byte Buddy). You can't import at compile time implicits that will only be mocked at runtime, so you'll have to resolve the implicits manually. (In principle you could resolve implicits at runtime by launching the compiler again at runtime, but that would be much harder.)
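To see the compile-time nature concretely, here is a tiny sketch (enc is a hypothetical name, reusing MyData from the question):

import org.apache.spark.sql.{ Encoder, Encoders }

implicit val enc: Encoder[MyData] = Encoders.product[MyData]

// The compiler inserts `enc` here while compiling, long before any test runs;
// a mock created at runtime can never take part in this resolution.
val resolved: Encoder[MyData] = implicitly[Encoder[MyData]]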
Try this instead:
import org.apache.spark.sql._
import org.mockito.Mockito.when
import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers
import org.scalatestplus.mockito.MockitoSugar

// Reuses case class MyData(key: String, value: String) from the question.

class ClassToTest()(implicit spark: SparkSession, encoder: Encoder[MyData]) {
  def read(path: String): Dataset[MyData] =
    spark.read.parquet(path).as[MyData]
}

class SparkMock extends AnyFlatSpec with Matchers with MockitoSugar {

  it should "be able to mock spark.implicits" in {
    implicit val sparkMock: SparkSession = mock[SparkSession]

    val readerMock = mock[DataFrameReader]
    when(sparkMock.read).thenReturn(readerMock)

    val dataFrameMock = mock[DataFrame]
    when(readerMock.parquet("/some/path")).thenReturn(dataFrameMock)

    val dataSetMock = mock[Dataset[MyData]]
    implicit val testEncoder: Encoder[MyData] = Encoders.product[MyData]
    when(dataFrameMock.as[MyData]).thenReturn(dataSetMock)

    new ClassToTest().read("/some/path") shouldBe dataSetMock
  }
}
//[info] SparkMock:
//[info] - should be able to mock spark.implicits
//[info] Run completed in 2 seconds, 727 milliseconds.
//[info] Total number of tests run: 1
//[info] Suites: completed 1, aborted 0
//[info] Tests: succeeded 1, failed 0, canceled 0, ignored 0, pending 0
//[info] All tests passed.
Please notice that "/some/path" must be the same in both places; in your code snippet the two strings were different ("/some/path" vs. "/some/path/").
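For completeness, production code can still get the encoder from spark.implicits._ at the call site, so the refactored signature stays convenient outside tests. A minimal sketch, assuming a hypothetical entry point and input path:

import org.apache.spark.sql.SparkSession

object Main {
  def main(args: Array[String]): Unit = {
    implicit val spark: SparkSession = SparkSession.builder().getOrCreate()
    import spark.implicits._ // derives Encoder[MyData] for the implicit parameter

    // "/data/my-data.parquet" is a hypothetical path
    new ClassToTest().read("/data/my-data.parquet").show()
  }
}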