Basj

Reputation: 46493

Fastest way to read a whole HDF5 containing Numpy arrays into memory

I use:

import h5py

# Read every dataset in the file fully into memory
f = h5py.File('myfile.h5', 'r')
d = {}
for k in f.keys():      # iterkeys() no longer exists in h5py under Python 3
    d[k] = f[k][:]      # [:] forces a full read into a NumPy array

to read the whole HDF5 file (2 GB, containing 1000 NumPy arrays of 2 MB each) into memory.

Is there a faster way to load the entire content of the HDF5 file into memory?

(Maybe the loop does a lot of seeking in the file, because the datasets f[k] are not necessarily stored on disk in the order that for k in f.keys() visits them?)
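If the seeks are indeed the bottleneck, one way to test that hypothesis is to read the datasets in increasing file-offset order. A minimal sketch, assuming all top-level entries are contiguous (non-chunked) datasets; it uses h5py's low-level get_offset(), which returns the byte offset of a dataset's raw data (or None for chunked or empty datasets):

import h5py

f = h5py.File('myfile.h5', 'r')

# Sort dataset names by the byte offset of their raw data,
# so the reads proceed sequentially through the file.
# get_offset() returns None for chunked/empty datasets, hence the fallback.
def data_offset(name):
    off = f[name].id.get_offset()
    return off if off is not None else 0

d = {}
for k in sorted(f.keys(), key=data_offset):
    d[k] = f[k][:]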

Upvotes: 5

Views: 4239

Answers (1)

Joe

Reputation: 6757

PyTables (another Python HDF5 library) supports loading the whole file into memory using the H5FD_CORE driver. h5py appears to support an in-memory file image as well, via the 'core' file driver (see File Drivers). So just do

import h5py

# The 'core' driver reads the entire file into memory when it is opened
f = h5py.File('myfile.h5', 'r', driver='core')

and you are done: the whole file image then resides in memory, so subsequent dataset reads copy from RAM instead of from disk.
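To end up with the same dict of NumPy arrays as in the question, the read loop stays as it was; only the open call changes. A sketch of the combined usage, assuming the same file layout as in the question (driver='core' is the only addition):

import h5py

# The raw file bytes are loaded into RAM in one sequential pass at open time
f = h5py.File('myfile.h5', 'r', driver='core')

# Each [:] read now copies from the in-memory image, not from disk
d = {k: f[k][:] for k in f.keys()}

If you prefer PyTables, the equivalent open call should be tables.open_file('myfile.h5', 'r', driver='H5FD_CORE').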

Upvotes: 8
