Reputation: 1506
I want to write Spark unit tests and I am using FunSuite for them. But I want my SparkContext to be initialized only once, used by all the suites, and then killed when all the suites complete.
abstract class baseClass extends FunSuite with BeforeAndAfter {
  before {
    println("initialize spark context")
  }
  after {
    println("kill spark context")
  }
}
@RunWith(classOf[JUnitRunner])
class A extends baseClass {
  test("for class A") {
    //assert
  }
}

@RunWith(classOf[JUnitRunner])
class B extends baseClass {
  test("for class B") {
    //assert
  }
}
But when I run sbt test, I can see that the println statements in baseClass are executed for both tests. Obviously, when an instance is created for each of the classes A and B, the abstract baseClass runs its before/after blocks. But then how can I achieve my purpose, i.e. have the Spark context initialized only once while all the test cases run?
Upvotes: 3
Views: 3966
Reputation: 1590
I strongly recommend using the spark-testing-base library in order to manage the lifecycle of a sparkContext or sparkSession during your tests. You won't have to pollute your tests by overriding the beforeAll/afterAll methods and managing the lifecycle of the sparkSession/sparkContext yourself.

You can share one sparkSession/sparkContext for all the tests by overriding the following method:

def reuseContextIfPossible: Boolean = true

For more details: https://github.com/holdenk/spark-testing-base/wiki/SharedSparkContext
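For illustration, a minimal sketch of a suite using the library's SharedSparkContext trait (the suite name and test body are mine; it assumes spark-testing-base is on the test classpath):

import com.holdenkarau.spark.testing.SharedSparkContext
import org.scalatest.FunSuite

class WordCountSuite extends FunSuite with SharedSparkContext {

  // Let the library reuse the same SparkContext across suites
  override def reuseContextIfPossible: Boolean = true

  test("counting words in a small RDD") {
    // sc is provided by SharedSparkContext
    val rdd = sc.parallelize(Seq("a", "b", "a"))
    assert(rdd.countByValue()("a") == 2)
  }
}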
I hope it helps!
Upvotes: 2
Reputation: 37852
If you really want to share the context between suites, you'll have to make it static. Then you can use a lazy value so it starts on first use. As for shutting it down, you can leave that to the automatic shutdown hook Spark registers each time a context is created.
It would look something like:
import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.FunSuite

abstract class SparkSuiteBase extends FunSuite {
  lazy val sparkContext: SparkContext = SparkSuiteBase.sparkContext
}

// Putting the SparkContext inside a companion object makes it effectively
// static, so it is created once and reused between all the suites
object SparkSuiteBase {
  // one possible way to create it: a local-mode context for tests
  private lazy val sparkContext =
    new SparkContext(new SparkConf().setMaster("local[*]").setAppName("test"))
}
Upvotes: 0
Reputation: 37852
Option 1: Use the excellent https://github.com/holdenk/spark-testing-base library, which does exactly that (and provides many other nice treats). After following the readme, it's as simple as mixing in SharedSparkContext instead of your baseClass, and you'll have an sc: SparkContext value ready to use in your test.

Option 2: To do it yourself, you'd want to mix in BeforeAndAfterAll and not BeforeAndAfter, and implement beforeAll and afterAll, which is exactly what the above-mentioned SharedSparkContext does.
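For illustration, a minimal sketch of Option 2 using ScalaTest's BeforeAndAfterAll (the SparkFunSuite name and local-mode config are mine); note that this creates one context per suite, not one shared across all suites:

import org.apache.spark.{SparkConf, SparkContext}
import org.scalatest.{BeforeAndAfterAll, FunSuite}

abstract class SparkFunSuite extends FunSuite with BeforeAndAfterAll {

  protected var sc: SparkContext = _

  // runs once, before the first test of the suite
  override def beforeAll(): Unit = {
    super.beforeAll()
    sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("suite"))
  }

  // runs once, after the last test of the suite
  override def afterAll(): Unit = {
    try sc.stop()
    finally super.afterAll()
  }
}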
Upvotes: 1