fatema sh

Reputation: 11

How to import a .mat -v7.3 file in Python using h5py BUT with the same ordering of dimensions?

I have several .mat files, each of which contains a matrix. I need to import them in Python using h5py, because they were saved with -v7.3.
For example:

*myfile.mat  includes matrix X with the size of (10, 20)*

I use the following commands in Python:

*import numpy as np, h5py
f=h5py.File('myfile.mat','r')
data=np.array(f['X'])
data.shape*    ->    **(20, 10)  Here is the problem!**

The matrix X is transposed. How can I import the X without being transposed?

Upvotes: 1

Views: 1259

Answers (1)

hpaulj

Reputation: 231665

I think you have to live with transposing. MATLAB is F ordered (column-major); numpy is C ordered (row-major) by default. Somewhere along the line loadmat does that transposing. h5py does not, so you have to do some sort of transposing or reordering yourself.

And by the way, transpose is one of the cheapest operations on a numpy array.
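To illustrate why the transpose is cheap, here is a minimal sketch. The helper name `load_mat_v73` is hypothetical (not part of h5py), and it assumes the variable is a plain numeric matrix stored as a single HDF5 dataset. The lines after it use a numpy stand-in for the array h5py would return, so they run without any .mat file.

```python
import numpy as np

def load_mat_v73(path, name):
    """Hypothetical helper: read one numeric variable from a -v7.3 .mat
    file and return it with MATLAB's dimension order."""
    import h5py  # imported here so the rest of the sketch runs without h5py
    with h5py.File(path, "r") as f:
        # f[name][()] reads the whole dataset; .T reverses all axes
        return f[name][()].T

# Stand-in for what h5py returns for a MATLAB matrix X of size (10, 20)
raw = np.arange(200, dtype=float).reshape(20, 10)

X = raw.T  # a view with swapped strides, no data is copied
print(X.shape)
print(np.shares_memory(raw, X))
print(X.flags['F_CONTIGUOUS'])
```

If a later step really needs a C-contiguous array, `np.ascontiguousarray(X)` makes a copy at that point; otherwise the view is usually enough.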

Save a (2,3) array in Octave:

octave:27> x=[0,1,2;3,4,5]
octave:28> save 'x34_7.mat' '-7' x
octave:33> save 'x34_h5.mat' '-hdf5' x
octave:32> reshape(x,[1,6])
ans =   0   3   1   4   2   5

Load it with loadmat. The shape is (2,3), but it is F ordered:

In [102]: x7=loadmat('x34_7.mat')

In [103]: x7['x']
Out[103]: 
array([[ 0.,  1.,  2.],
       [ 3.,  4.,  5.]])

In [104]: _.flags
Out[104]: 
  C_CONTIGUOUS : False
  F_CONTIGUOUS : True
  ...

Look at the h5 version:

In [110]: f=h5py.File('x34_h5.mat','r')

In [111]: x5=f['x']['value'][:]
Out[111]: 
array([[ 0.,  3.],
       [ 1.,  4.],
       [ 2.,  5.]])
# C_contiguous

and the data in x5 buffer is in the same order as in Octave:

In [134]: np.frombuffer(x5.data, float)
Out[134]: array([ 0.,  3.,  1.,  4.,  2.,  5.])

So is the data from loadmat (though I have to transpose it to look at it with frombuffer, which needs a C contiguous array):

In [139]: np.frombuffer(x7['x'].T.data,float)
Out[139]: array([ 0.,  3.,  1.,  4.,  2.,  5.])

(Is there a better way of verifying that x5.data and x7.data have the same content?)
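One possible way to do that check without frombuffer is sketched below. The arrays here are stand-ins built by hand to match the two results above (x7 for the loadmat array, x5 for the h5py array), not values read from the files. `ravel(order='K')` returns the elements in the order they sit in memory, so it can be compared directly.

```python
import numpy as np

# Stand-in for the loadmat result: shape (2,3), F contiguous
x7 = np.asfortranarray([[0., 1., 2.],
                        [3., 4., 5.]])
# Stand-in for the h5py result: shape (3,2), C contiguous
x5 = np.array([[0., 3.],
               [1., 4.],
               [2., 5.]])

# order='K' flattens in memory order, whatever the layout is
same_memory_order = np.array_equal(x5.ravel(order='K'), x7.ravel(order='K'))

# The logical content matches once one of them is transposed
same_values = np.array_equal(x5.T, x7)

print(same_memory_order, same_values)
```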


This pattern holds with higher dimensions. In MATLAB it's the 1st dimension that varies most rapidly. Loaded by h5py, that dimension corresponds to the last. So x(:,2,2,2) in MATLAB would correspond to x[1,1,1,:] in the h5py array, and to x.T[:,1,1,1].
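The index mapping can be checked with a small numpy-only sketch. The names `m` and `h` are made up for illustration: `m` plays the role of a MATLAB array of size (2,3,4,5), and `h` is the reversed-axes array that h5py would hand back for it.

```python
import numpy as np

# "MATLAB" array of size (2,3,4,5): order='F' makes the 1st dimension
# vary most rapidly, as it does in MATLAB
m = np.arange(2 * 3 * 4 * 5).reshape((2, 3, 4, 5), order='F')

# What h5py would present: same buffer, axes reversed, C contiguous
h = np.ascontiguousarray(m.T)
print(h.shape)

# MATLAB x(:,2,2,2) is m[:,1,1,1] with 0-based indexing
matlab_slice = m[:, 1, 1, 1]

print(np.array_equal(h[1, 1, 1, :], matlab_slice))
print(np.array_equal(h.T[:, 1, 1, 1], matlab_slice))
```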

Upvotes: 1
