SmCaterpillar

Reputation: 7020

Python Unit Testing with PyTables and HDF5

What is a proper way to do Unit Testing with file IO, especially if it involves PyTables and HDF5?

My application revolves around storing Python data in HDF5 files and retrieving it again. So far I simply write the HDF5 files in the unit tests myself and load them for comparison. The problem is that when someone else runs the tests, I cannot be sure they have the privileges to actually write files to the hard disk. (This probably gets even worse when I want to use automated test frameworks like Jenkins, but I haven't checked that yet.)

What is a proper way to handle these situations? Is it best practice to create a /tmp/ folder at a particular place where write access is very likely to be granted? If so, where is that? Or is there an easy and straightforward way to mock PyTables writing and reading?

Thanks a lot!

Upvotes: 2

Views: 1192

Answers (2)

Anthony Scopatz

Reputation: 3627

Fundamentally, HDF5 and PyTables are I/O libraries. They provide an API for file system manipulation. Therefore, if you really want to test PyTables / HDF5, you have to hit the file system. There is no way around this. If a user does not have write access on a system, they cannot run the tests, or at least they cannot run realistic tests.

You can use the in-memory file driver to do testing. This is useful for speeding up most tests and for testing higher-level functionality. However, even if you go this route, you should still have a few tests that actually write out real files. If these fail, you know that something is wrong with file access rather than with your own logic.
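A minimal sketch of such an in-memory test, assuming a PyTables version that accepts the `driver="H5FD_CORE"` and `driver_core_backing_store=0` arguments described in the PyTables cookbook. The class name, the file label and the sample data are made up for illustration, and the test is skipped if PyTables is not installed:

```python
import unittest

try:
    import tables  # PyTables
except ImportError:  # let the module load even where PyTables is missing
    tables = None


@unittest.skipIf(tables is None, "PyTables is not installed")
class InMemoryRoundTripTest(unittest.TestCase):
    def setUp(self):
        # H5FD_CORE keeps the whole HDF5 file in RAM. With the backing
        # store set to 0, nothing is written to disk when it is closed.
        self.h5 = tables.open_file(
            "in_memory_test.h5",  # hypothetical label, no file is created
            mode="w",
            driver="H5FD_CORE",
            driver_core_backing_store=0,
        )

    def tearDown(self):
        self.h5.close()

    def test_array_round_trip(self):
        self.h5.create_array(self.h5.root, "values", [1, 2, 3])
        # list() makes the comparison independent of the returned flavor.
        self.assertEqual(list(self.h5.root.values.read()), [1, 2, 3])
```

Because no file is created, this kind of test should not depend on write permissions, which is also why it cannot replace the few real-file tests.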

Normally, people create the temporary h5 files in the tests directory. But if you are truly worried about the user not having write access to this directory, you should use tempfile.gettempdir() to find the correct temporary directory for their environment. Note that this is cross-platform, so it should work everywhere. Put the h5 files that you create there and remember to delete them afterwards!
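One way to arrange the create-and-delete cycle with only the standard library is sketched below. `save_data` is a hypothetical stand-in for whatever code actually writes the HDF5 file, and the class and file names are invented for the example. `tempfile.mkdtemp()` creates a fresh private directory inside the location that `tempfile.gettempdir()` reports:

```python
import os
import shutil
import tempfile
import unittest


def save_data(path, payload):
    """Hypothetical stand-in for the code under test (e.g. a PyTables write)."""
    with open(path, "wb") as handle:
        handle.write(payload)


class TempDirStorageTest(unittest.TestCase):
    def setUp(self):
        # A new directory per test keeps tests from seeing each other's files.
        self.tmpdir = tempfile.mkdtemp(prefix="h5_tests_")
        self.path = os.path.join(self.tmpdir, "test.h5")

    def tearDown(self):
        # Remove the directory and everything the test wrote into it.
        shutil.rmtree(self.tmpdir, ignore_errors=True)

    def test_file_is_written(self):
        save_data(self.path, b"example")
        with open(self.path, "rb") as handle:
            self.assertEqual(handle.read(), b"example")
```

Cleaning up in `tearDown` means the files are removed even when an assertion in the test fails.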

Upvotes: 1

cxrodgers

Reputation: 4697

How about using the module "tempfile" to create the files?

http://docs.python.org/2/library/tempfile.html

I don't know if it's guaranteed to work on all platforms but I bet it does work on most common ones. It would certainly be better practice than hardcoding "/tmp" as the destination.

Another way would be to create an HDF5 database in memory so that no file I/O is required.

http://pytables.github.io/cookbook/inmemory_hdf5_files.html

I obtained that link by googling "hdf5 in memory" so I can't say for sure how well it works.

I think the best practice would be to write all test cases so that they run against both an in-memory database and a tempfile database. This way, even if one of the above techniques fails for the user, the rest of the tests will still run. You can also tell separately whether a bug is related to file writing or to something internal to the database.
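A rough sketch of that idea, assuming PyTables' `open_file` with the `H5FD_CORE` driver for the in-memory case. The mixin and class names are invented for illustration: the shared checks live in a mixin, and each backend only decides how the file is opened. Both classes are skipped if PyTables is not installed:

```python
import os
import shutil
import tempfile
import unittest

try:
    import tables  # PyTables
except ImportError:
    tables = None


class RoundTripChecks:
    """Backend-agnostic tests; subclasses supply open_store()."""

    def setUp(self):
        self.h5 = self.open_store()

    def tearDown(self):
        self.h5.close()

    def test_array_round_trip(self):
        self.h5.create_array(self.h5.root, "values", [1, 2, 3])
        self.assertEqual(list(self.h5.root.values.read()), [1, 2, 3])


@unittest.skipIf(tables is None, "PyTables is not installed")
class InMemoryStoreTest(RoundTripChecks, unittest.TestCase):
    def open_store(self):
        # Nothing touches the disk: core driver without a backing store.
        return tables.open_file("in_memory.h5", mode="w",
                                driver="H5FD_CORE",
                                driver_core_backing_store=0)


@unittest.skipIf(tables is None, "PyTables is not installed")
class TempFileStoreTest(RoundTripChecks, unittest.TestCase):
    def setUp(self):
        self.tmpdir = tempfile.mkdtemp(prefix="h5_tests_")
        # Cleanups run after tearDown, so the file is closed before removal.
        self.addCleanup(shutil.rmtree, self.tmpdir, ignore_errors=True)
        super().setUp()

    def open_store(self):
        return tables.open_file(os.path.join(self.tmpdir, "store.h5"), mode="w")
```

If only `TempFileStoreTest` fails on some machine, that points at file access rather than at the storage logic itself.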

Upvotes: 1
