Reputation: 5367
I have a large set of MATLAB data files which I need to access in Python.
The files were saved using save with the -v6 or -v7 option, but not -v7.3.
I need to read only a single numerical value from each file, but there are many files (100k+) and they are relatively large (1MB+). As a result, I spend 99% of the time idling in useless I/O operations.
I am looking for something like a partial load, which is feasible for -v7.3 files using an HDF5 library.
So far, I have been using the scipy.io.loadmat API.
Documentation says:
v4 (Level 1.0), v6 and v7 to 7.2 matfiles are supported.
You will need an HDF5 python library to read matlab 7.3 format mat files.
Because scipy does not supply one, we do not implement the HDF5 / 7.3 interface here.
https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.loadmat.html
But it looks like it does not allow a partial load.
Does anyone have experience with implementing such a feature, or does anyone know how to parse these .mat files at a lower level?
I guess an fseek-like approach could be possible when the structure is known.
Upvotes: 2
Views: 1310
Reputation: 210972
Use the variable_names parameter if you want to read a single variable:
from scipy.io import loadmat
d = loadmat(filename, variable_names=['variable_name'])
then access it as follows:
d['variable_name']
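If you are not sure which variables a file contains, scipy.io.whosmat lists the name, shape and class of each variable; as far as I know it only reads the variable headers, not the array data. Here is a minimal sketch that combines it with variable_names. The file name demo.mat and the variable names big and scalar are made up for illustration:

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat, whosmat

# Build a throwaway file with one large and one tiny variable (made-up names).
path = os.path.join(tempfile.mkdtemp(), "demo.mat")
savemat(path, {"big": np.random.rand(200, 200), "scalar": 3.5},
        do_compression=True)

# One (name, shape, class) tuple per variable stored in the file.
info = whosmat(path)
for name, shape, klass in info:
    print(name, shape, klass)

# Ask loadmat for "scalar" only; "big" is not returned in the dict.
d = loadmat(path, variable_names=["scalar"])
value = float(d["scalar"][0, 0])
print(value)
```

The returned dict still contains the usual __header__, __version__ and __globals__ entries next to the requested variable.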
UPDATE: if you need just the first element of an array/matrix you can do this:
val = loadmat(filename, variable_names=['var_name']).get('var_name')[0, 0]
NOTE: it will still read the whole variable into memory, but it will be deleted after the first element is assigned to val.
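Regarding the fseek-like approach mentioned in the question: if reading the whole variable is still too slow, you could parse the file yourself. The following is only a rough sketch, not a tested replacement for loadmat. It assumes the MAT-file Level 5 layout as I understand it (128-byte header, then elements with an 8-byte tag, type 14 = miMATRIX, type 15 = miCOMPRESSED), assumes the variable is a real numeric array, and assumes the first 4096 bytes of each element are enough to reach the first value. The names first_value, _read_tag, _name_and_first and the prefix parameter are my own, not part of any library:

```python
import struct
import zlib

MI_MATRIX, MI_COMPRESSED = 14, 15
# MAT-file storage type code -> struct format character
MI_FORMATS = {1: "b", 2: "B", 3: "h", 4: "H", 5: "i", 6: "I",
              7: "f", 9: "d", 12: "q", 13: "Q"}
NUMERIC_CLASSES = range(6, 16)  # mxDOUBLE_CLASS .. mxUINT64_CLASS


def _read_tag(buf, pos, bo):
    """Return (type, nbytes, data_start, next_pos) for the subelement at pos."""
    first, second = struct.unpack_from(bo + "II", buf, pos)
    if first >> 16:  # small data element: size and type share 4 bytes
        return first & 0xFFFF, first >> 16, pos + 4, pos + 8
    # Regular element: data follows the 8-byte tag, padded to 8 bytes.
    return first, second, pos + 8, pos + 8 + ((second + 7) & ~7)


def _name_and_first(body, bo):
    """Parse the start of a miMATRIX body: (name, first value or None)."""
    _, _, start, pos = _read_tag(body, 0, bo)        # array flags
    klass = struct.unpack_from(bo + "I", body, start)[0] & 0xFF
    _, _, _, pos = _read_tag(body, pos, bo)          # dimensions
    _, n, start, pos = _read_tag(body, pos, bo)      # array name
    name = body[start:start + n].decode("ascii")
    if klass not in NUMERIC_CLASSES:
        return name, None
    mdtype, n, start, _ = _read_tag(body, pos, bo)   # real-part data
    fmt = MI_FORMATS.get(mdtype)
    if fmt is None or n == 0:
        return name, None
    return name, struct.unpack_from(bo + fmt, body, start)[0]


def first_value(path, wanted, prefix=4096):
    """First element of numeric variable `wanted`; None if absent or unsupported."""
    with open(path, "rb") as f:
        header = f.read(128)
        # The endian indicator reads b"IM" in little-endian files.
        bo = "<" if header[126:128] == b"IM" else ">"
        while True:
            tag = f.read(8)
            if len(tag) < 8:
                return None  # end of file, variable not found
            mdtype, nbytes = struct.unpack(bo + "II", tag)
            next_pos = f.tell() + nbytes
            if mdtype not in (MI_MATRIX, MI_COMPRESSED):
                f.seek(next_pos)
                continue
            chunk = f.read(min(nbytes, prefix))
            if mdtype == MI_COMPRESSED:
                # Inflate only the first bytes; drop the inner miMATRIX tag.
                chunk = zlib.decompressobj().decompress(chunk, prefix)[8:]
            name, value = _name_and_first(chunk, bo)
            if name == wanted:
                return value
            f.seek(next_pos)  # jump over the rest of this variable
```

Usage would be something like val = first_value(filename, 'var_name'). The value comes back in the storage type used in the file, which may be an integer even for a double array, so convert it if needed, and compare the results against loadmat on a few files before trusting it.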
Upvotes: 3