theDataNerd

Reputation: 103

How to Mock Delta Tables in Spark or Create an In-Memory File System for Writing Delta Tables?

I'm working on a Spark project that uses Delta Lake (delta.io). During testing, I've run into difficulties mocking Delta tables or creating an in-memory file system that I can write Delta tables to. I'm hoping an in-memory file system will speed up the test suite, since we have hundreds of tests.

I've tried using the pyfakefs library in Python to mock the file system, but it didn't work out: Spark creates temp files on the local filesystem, and pyfakefs then complains that those temp files don't exist in the fake filesystem. Are there any alternative solutions or best practices for achieving this?

Specifically, I'm looking for a way to:

- Mock Delta tables in Spark for unit/integration testing, OR
- Create an in-memory file system where I can write Delta tables during testing.

Thank you in advance for your help!

Upvotes: 0

Views: 424

Answers (0)
