user3685285

Reputation: 6586

SparkSession doesn't shutdown properly between unit tests

I have a few unit tests that each need their own SparkSession. I extended SQLTestUtils and am overriding the beforeAll and afterAll functions that are used in many other Spark unit tests (from the source). My test suites look something like this:

class MyTestSuite extends QueryTest with SQLTestUtils {

    protected var spark: SparkSession = null

    override def beforeAll(): Unit = {
        super.beforeAll()
        spark = // initialize sparkSession...
    }

    override def afterAll(): Unit = {
        try {
            spark.stop()
            spark = null
        } finally {
            super.afterAll()
        }
    }

    // ... my tests ...

}

If I run one of these, it's fine, but if I run two or more, I get this error:

Caused by: ERROR XSDB6: Another instance of Derby may have already booted the database /home/jenkins/workspace/Query/apache-spark/sql/hive-thriftserver-cat-server/metastore_db.

But I thought afterAll() was supposed to shut Spark down properly so that I could create a new one. Is that not right? How do I accomplish this?

Upvotes: 2

Views: 630

Answers (1)

Denis Makarenko

Reputation: 2938

One way to do this is to disable parallel test execution for your Spark app project, so that only one SparkSession instance is active at a time. In sbt syntax it would look like this:

  lazy val sparkApp = project.in(file("your_spark_app"))
    .settings(parallelExecution in Test := false)

The downside is that this is a per-project setting, so it also affects tests that would benefit from parallelization. A workaround is to move the Spark tests into a separate project, as sketched below.
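
For example, a build.sbt along these lines would keep the regular tests parallel and serialize only the Spark suites. The project and directory names here (core, spark_tests) are placeholders, not taken from the question:

  // Hypothetical two-project layout: only the Spark suites lose parallelism.
  lazy val core = project.in(file("core"))  // regular tests keep running in parallel

  lazy val sparkTests = project.in(file("spark_tests"))
    .dependsOn(core)
    .settings(parallelExecution in Test := false)  // Spark suites run one at a time

With this split, only one suite under spark_tests runs at a time, so only one SparkSession (and its embedded Derby metastore, which allows a single connection) is booted at once.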

Upvotes: 1
