Reputation: 665
I use Spark 1.6.
We have an HDFS write method that wrote to HDFS using SQLContext. We now need to switch over to HiveContext. When we did that, the existing unit tests stopped running and now fail with this error:
Error XSDB6: Another instance of Derby may have already booted the database <local path>\metastore_db
This happens whether I run a single test via IntelliJ test runner or via maven on the command line.
As I understand it, this issue occurs when multiple HiveContexts or multiple processes try to access metastore_db. However, I am running a single test and no other jobs on my local machine, so I fail to understand where the multiple processes are coming from.
Upvotes: 4
Views: 2709
Reputation: 1005
I was getting the same error too, though in my case I was running a test suite.
I could run each individual test file successfully, but when I ran the suite a few tests kept failing. Many of the tests were doing IO on the local file system using SparkSession.
In that situation, add an after method to every test file (in my case, it was missing in 1-2 files) to stop the session.
after {
sparkSession.stop()
}
Upvotes: 0
Reputation: 11
When a HiveContext is instantiated, it creates a metastore directory named metastore_db in your test path. Deleting this directory after your test allows you to create a HiveContext again.
Java:
// FileUtils is org.apache.commons.io.FileUtils; metastoreDbPath is the path of metastore_db
FileUtils.deleteDirectory(new File(metastoreDbPath));
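If Commons IO is not on the test classpath, the same cleanup can be sketched with the JDK alone. The class name MetastoreCleanup and the method deleteRecursively below are made up for illustration. The sketch assumes the context that opened the metastore has already been shut down, since files still held open may not be deletable on some platforms.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical helper, not part of Spark or Hive.
final class MetastoreCleanup {
    private MetastoreCleanup() {}

    /** Recursively deletes dir if it exists; does nothing otherwise. */
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order so that children are deleted before their parents.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
```

A test teardown could then call MetastoreCleanup.deleteRecursively(Paths.get("metastore_db")), assuming the directory was created relative to the working directory.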
Upvotes: 1
Reputation: 665
Figured out why I was getting the error. In the unit test we were writing data to ORC on the local file system and then reading it back to verify that the write was done properly.
The write and read methods were each creating their own HiveContext in the same process, which resulted in the lock on the metastore. I am guessing that with SQLContext this wasn't a blocker since a local metastore was not needed.
We have now moved to creating the HiveContext when we construct our persistence service. Semantically that makes more sense. This option was chosen over creating and destroying a new SparkContext (and thereby a new HiveContext) for every test, since that would add considerable overhead to our test suite without providing much benefit (please do correct me if you have a different opinion).
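A rough sketch of that arrangement, written so it runs without Spark: StubContext is a made-up stand-in for HiveContext that only counts how many instances were created, and PersistenceService is a hypothetical shape for the service, not our actual class. The point is only that write and read reuse the one context passed in at construction.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for HiveContext (hypothetical); counts how many instances get created. */
final class StubContext {
    static final AtomicInteger CREATED = new AtomicInteger();
    private final List<String> rows = new ArrayList<>();

    StubContext() {
        CREATED.incrementAndGet();
    }

    void save(String row) { rows.add(row); }
    List<String> load() { return new ArrayList<>(rows); }
}

/** Hypothetical persistence service: the context is injected once and reused. */
final class PersistenceService {
    private final StubContext context;

    PersistenceService(StubContext context) {
        this.context = context;
    }

    // Neither method creates a context of its own.
    void write(String row) { context.save(row); }
    List<String> read() { return context.load(); }
}
```

With this shape a test suite can build one context up front and hand it to every service instance it constructs.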
Upvotes: 0