Yilun Zhang

Reputation: 9018

Python unittest: directory and file paths

I have the following file structure for a library I'm developing that exposes a keras model:

relevancy (repo)
    relevancy (package repo)
        data
            model.h5
            tokenizer.pickle
        test
            __init__.py
            test_model.py
        model.py
        __init__.py
    __init__.py
    setup.py

The library basically loads the pre-trained tokenizer.pickle and model.h5 and makes predictions on input data.

Within model.py, I have a function with the following code that loads the tokenizer and model:

import pickle
import keras

def load():
    with open("data/tokenizer.pickle", "rb") as f:
        tokenizer = pickle.load(f)
    model = keras.models.load_model("data/model.h5")
    return tokenizer, model

In test_model.py, I'm calling this function in my tests.

If I then run python setup.py test from /relevancy (repo), I get an error saying that data/tokenizer.pickle is not found. Apparently, the relative path is causing the problem.

How should I set up my directories or paths so that the tokenizer and model can always be loaded correctly?

Upvotes: 0

Views: 1242

Answers (1)

larsks

Reputation: 311606

If you need to access data files stored inside your package, consider using the pkg_resources module.

Then in model.py you can do something like this:

import pkg_resources

filename = pkg_resources.resource_filename(__name__, 'data/tokenizer.pickle')
with open(filename, 'rb') as f:
    ...
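
The underlying issue is that a path like "data/tokenizer.pickle" is resolved against the current working directory, not against the location of model.py. As a rough illustration, here is a sketch of a different technique from pkg_resources: anchoring the path on the module file with pathlib, which does not depend on setuptools. The helper names data_path and load_pickle are hypothetical, and the temporary directory only mimics the layout from the question.

```python
import os
import pickle
import tempfile
from pathlib import Path


def data_path(module_file, name):
    """Return the absolute path of data/<name> next to the given module file."""
    # Anchor on the module's own location, not on the current working directory.
    return Path(module_file).resolve().parent / "data" / name


def load_pickle(module_file, name):
    """Unpickle data/<name> located beside module_file (hypothetical helper)."""
    with open(data_path(module_file, name), "rb") as f:
        return pickle.load(f)


# Throwaway layout that mimics relevancy/model.py and relevancy/data/.
with tempfile.TemporaryDirectory() as tmp:
    package_dir = Path(tmp) / "relevancy"
    (package_dir / "data").mkdir(parents=True)
    fake_module = package_dir / "model.py"  # stands in for __file__ in model.py
    with open(package_dir / "data" / "tokenizer.pickle", "wb") as f:
        pickle.dump({"hello": 1}, f)

    original_cwd = os.getcwd()
    os.chdir(tmp)  # simulate running the tests from the repo root
    try:
        # The bare relative path is looked up under the working directory.
        relative_exists = os.path.exists("data/tokenizer.pickle")
        # The anchored path does not depend on the working directory.
        loaded = load_pickle(fake_module, "tokenizer.pickle")
    finally:
        os.chdir(original_cwd)
```

Inside model.py itself you would pass __file__, for example data_path(__file__, "tokenizer.pickle"), and hand str(data_path(__file__, "model.h5")) to keras.models.load_model. This assumes the package is installed as ordinary files on disk; for zipped installs, a resource API such as pkg_resources is the safer choice.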

Upvotes: 2
