00__00__00

Reputation: 5367

Partial load of MATLAB (.mat) -v7 files in Python

I have a large set of MATLAB data files which I need to access in Python. The files were saved using save with the -v6 or -v7 option, not -v7.3.

I need to read only a single numerical value from each file, but there are many files (100k+) and each is relatively large (1 MB+). As a result, I spend 99% of the time idling in I/O operations that read data I never use.

I am looking for something like a partial load, which is feasible for -v7.3 files using an HDF5 library.

So far, I have been using the scipy.io.loadmat API.

Documentation says:

v4 (Level 1.0), v6 and v7 to 7.2 matfiles are supported.
You will need an HDF5 python library to read matlab 7.3 format mat files. 
Because scipy does not supply one, we do not implement the HDF5 / 7.3 interface here.

https://docs.scipy.org/doc/scipy/reference/generated/scipy.io.loadmat.html

But it looks like it does not allow partial load.

Does anyone have experience with implementing such a feature, or does anyone know how to parse these .mat files at a lower level?

I guess an fseek-like approach could be possible when the structure is known.
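To illustrate what such an fseek-like approach might look like, here is a minimal sketch using only the standard library. It assumes the published MAT-file Level 5 layout (a 128-byte header followed by top-level data elements, each starting with an 8-byte tag holding a data type and a byte count). The function name list_top_level_elements is hypothetical, and the sketch only locates elements; it does not decode them.

```python
import io
import struct

MI_MATRIX = 14      # uncompressed variable (typical for -v6 files)
MI_COMPRESSED = 15  # zlib-compressed variable (typical for -v7 files)


def list_top_level_elements(fileobj):
    """Yield (offset, data_type, n_bytes) for each top-level data element.

    Hypothetical helper: assumes a MAT 5 file opened in binary mode.
    """
    header = fileobj.read(128)
    if len(header) < 128:
        raise ValueError("file too short for a MAT 5 header")
    # Bytes 126-127 are the endian indicator; b"IM" means little-endian.
    byte_order = "<" if header[126:128] == b"IM" else ">"
    while True:
        offset = fileobj.tell()
        tag = fileobj.read(8)
        if len(tag) < 8:
            break  # end of file
        data_type, n_bytes = struct.unpack(byte_order + "II", tag)
        yield offset, data_type, n_bytes
        # Jump over the payload instead of reading it.
        fileobj.seek(n_bytes, io.SEEK_CUR)
```

Usage would be along the lines of `with open(path, "rb") as f: elements = list(list_top_level_elements(f))`. Note that in -v7 files each variable is normally stored as one compressed element, so reading a value inside a variable still means decompressing at least the start of that element; only the other variables can be skipped by seeking.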

Upvotes: 2

Views: 1310

Answers (1)

MaxU - stand with Ukraine

Reputation: 210972

Use the variable_names parameter if you want to read a single variable:

from scipy.io import loadmat
d = loadmat(filename, variable_names=['variable_name'])

then access it as follows:

d['variable_name']

UPDATE: if you need just the first element of an array/matrix, you can do this:

val = loadmat(filename, variable_names=['var_name']).get('var_name')[0, 0]

NOTE: this still reads the whole variable into memory, but the array is freed once the first element has been assigned to val.
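Applied to a whole folder of files, the same idea might look like the sketch below. The helper name first_values and the folder layout are assumptions for illustration; only loadmat and its variable_names parameter come from scipy.

```python
from pathlib import Path

from scipy.io import loadmat


def first_values(folder, var_name):
    """Map each .mat file name in `folder` to element [0, 0] of `var_name`.

    Hypothetical helper; files that lack the variable are left out.
    """
    results = {}
    for path in sorted(Path(folder).glob("*.mat")):
        # Request a single variable so the other variables are not returned.
        data = loadmat(str(path), variable_names=[var_name])
        if var_name in data:
            results[path.name] = data[var_name][0, 0]
    return results
```

Calling `first_values("data_dir", "var_name")` would return a dict keyed by file name. Each requested variable is still loaded in full before indexing, as in the note above.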

Upvotes: 3
