Reputation: 9018
I have the following file structure for a library I'm developing that exposes a keras model:
relevancy (repo)
    relevancy (package repo)
        data
            model.h5
            tokenizer.pickle
        test
            __init__.py
            test_model.py
        model.py
        __init__.py
    __init__.py
    setup.py
The library loads the pre-trained tokenizer.pickle and model.h5 and makes predictions on input data.
Within model.py, I have a function with the following code that loads the tokenizer and model:
    import pickle
    import keras

    def load():
        # Paths are relative to the current working directory
        with open("data/tokenizer.pickle", "rb") as f:
            tokenizer = pickle.load(f)
        model = keras.models.load_model("data/model.h5")
        return tokenizer, model
In test_model.py, I'm calling this function in my tests.
If I then run python setup.py test under /relevancy (repo), I get an error saying that data/tokenizer.pickle is not found. Apparently, the relative path is causing the problem.
How should I set up my directory or paths so that the tokenizer and model can always be loaded correctly?
Upvotes: 0
Views: 1242
Reputation: 311606
If you need to access data files stored inside your package, consider using the pkg_resources module.
Then in model.py you can do something like this:
    import pkg_resources

    filename = pkg_resources.resource_filename(__name__, 'data/tokenizer.pickle')
    with open(filename, 'rb') as f:
        ...
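As an alternative to pkg_resources, the same idea can be sketched with pathlib by building the path from the package directory instead of the working directory. This is only a sketch: data_file and load_tokenizer are hypothetical names, and the package_dir argument is an assumption that in model.py would typically be Path(__file__).resolve().parent.

```python
import pickle
from pathlib import Path


def data_file(package_dir, name):
    """Build the path to a file in the package's data directory.

    Hypothetical helper: package_dir is the directory containing model.py,
    e.g. Path(__file__).resolve().parent when called from that module.
    """
    return Path(package_dir) / "data" / name


def load_tokenizer(package_dir):
    """Load the pickled tokenizer regardless of the current working directory."""
    with open(data_file(package_dir, "tokenizer.pickle"), "rb") as f:
        return pickle.load(f)
```

The model path can be built the same way, for example by passing str(data_file(package_dir, "model.h5")) to keras.models.load_model. If the package is installed rather than run from the repo, the data files also need to be included in the distribution, for example through package_data in setup.py.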
Upvotes: 2