Reputation: 8386
I have an HDF5 file that contains about 10 databases that I need in various places (different modules) across my project.
At the moment I use a simple function that will give me the database that I want:
import pandas as pd

def get_hdf5_dataframe(dataframe_name: str) -> pd.DataFrame:
    db = pd.HDFStore("/database.h5")
    df = db[dataframe_name]
    db.close()  # needs to be closed every time I access it
    return df
However, this is not efficient, as the program has to open and read the file on every call.
If I use the lru_cache decorator, the program will still load the file 10 times, once for each database.
What would be an efficient way to get the databases while loading the file only once, and to make sure the HDF5 file is closed after reading it?
Upvotes: 0
Views: 185
Reputation: 2246
You could store the opened file as a global:
db = None

def get_hdf5_dataframe(dataframe_name: str) -> pd.DataFrame:
    global db
    if db is None:
        db = pd.HDFStore("/database.h5")
    df = db[dataframe_name]
    return df
This opens the file only once, on first access, although the file then stays open for the life of your program. Use globals with caution, though: they can make life difficult if overused.
Upvotes: 1