abdoulsn

Reputation: 1159

Convert mat file to pandas dataframe problem

Hello, I'm stuck trying to convert a MATLAB matrix to a pandas DataFrame. The conversion runs, but I end up with a single row whose cells contain lists of lists. Those inner lists should normally be my rows.

import pandas as pd
import numpy as np
from scipy.io import loadmat  # public entry point; scipy.io.matlab.mio is a deprecated private module
Data_mat = loadmat('senet50-ferplus-logits.mat')

Data_mat.keys() gives me this output:

dict_keys(['__header__', '__version__', '__globals__', 'images', 'wavLogits'])

I'd like to convert images and wavLogits to a DataFrame. Following the solution in this post, I tried:

cardio_df = pd.DataFrame(np.hstack((Data_mat['images'], Data_mat['wavLogits'])))

The output is a DataFrame with a single row whose cells hold nested arrays.

How can I get the DataFrame into a proper format, with one record per row?

[UPDATE] Data_mat["images"] contains:

array([[(array([[array(['A.J._Buckley/test/Y8hIVOBuels_0000001.wav'], dtype='<U41'),
        array(['A.J._Buckley/test/Y8hIVOBuels_0000002.wav'], dtype='<U41'),
        array(['A.J._Buckley/test/Y8hIVOBuels_0000003.wav'], dtype='<U41'),
        ...,
        array(['Zulay_Henao/train/s4R4hvqrhFw_0000007.wav'], dtype='<U41'),
        array(['Zulay_Henao/train/s4R4hvqrhFw_0000008.wav'], dtype='<U41'),
        array(['Zulay_Henao/train/s4R4hvqrhFw_0000009.wav'], dtype='<U41')]],
      dtype=object), array([[     1,      2,      3, ..., 153484, 153485, 153486]], dtype=int32), array([[   1,    1,    1, ..., 1251, 1251, 1251]], dtype=uint16), array([[array(['Y8hIVOBuels'], dtype='<U11'),
        array(['Y8hIVOBuels'], dtype='<U11'),
        array(['Y8hIVOBuels'], dtype='<U11'), ...,
        array(['s4R4hvqrhFw'], dtype='<U11'),
        array(['s4R4hvqrhFw'], dtype='<U11'),
        array(['s4R4hvqrhFw'], dtype='<U11')]], dtype=object), array([[1, 2, 3, ..., 7, 8, 9]], dtype=uint8), array([[array(['A.J._Buckley/1.6/Y8hIVOBuels/1/01.jpg'], dtype='<U37')],
       [array(['A.J._Buckley/1.6/Y8hIVOBuels/1/02.jpg'], dtype='<U37')],
       [array(['A.J._Buckley/1.6/Y8hIVOBuels/1/03.jpg'], dtype='<U37')],
       ...,
       [array(['Zulay_Henao/1.6/s4R4hvqrhFw/9/16.jpg'], dtype='<U36')],
       [array(['Zulay_Henao/1.6/s4R4hvqrhFw/9/17.jpg'], dtype='<U36')],
       [array(['Zulay_Henao/1.6/s4R4hvqrhFw/9/18.jpg'], dtype='<U36')]],
      dtype=object), array([[1.00000e+00],
       [1.00000e+00],
       [1.00000e+00],
       ...,
       [1.53486e+05],
       [1.53486e+05],
       [1.53486e+05]], dtype=float32), array([[3, 3, 3, ..., 1, 1, 1]], dtype=uint8))]],
      dtype=[('name', 'O'), ('id', 'O'), ('sp', 'O'), ('video', 'O'), ('track', 'O'), ('denseFrames', 'O'), ('denseFramesWavIds', 'O'), ('set', 'O')])

Upvotes: 2

Views: 3701

Answers (1)

Rainb

Reputation: 2465

So this is what I'd do to convert a mat file into a pandas dataframe automagically.

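import numpy as np
import pandas as pd
import scipy.io
mat = scipy.io.loadmat('file.mat')
mat = {k: v for k, v in mat.items() if k[0] != '_'}  # drop metadata keys such as __header__
df = pd.DataFrame({k: np.array(v).flatten() for k, v in mat.items()})  # one flattened column per variable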
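Note that the dict comprehension above only works when every variable flattens to the same length. In the question, images is a 1x1 MATLAB struct whose fields have different lengths, so it probably needs unpacking first. Here is a rough sketch of one way to do that. It assumes loadmat is called with squeeze_me=True and struct_as_record=False, so the struct comes back as an object with one attribute per field. The in-memory file and its values are made up so the snippet runs on its own, and the `frames` grouping by length is my own choice, not something the file format requires.

```python
import io

import numpy as np
import pandas as pd
from scipy.io import loadmat, savemat

# Hypothetical stand-in for the real .mat file: a struct named "images"
# whose fields do not all have the same length, as in the question.
buf = io.BytesIO()
savemat(buf, {
    "images": {
        "name": np.array(["a/1.wav", "a/2.wav", "b/1.wav"], dtype=object),
        "id": np.array([1, 2, 3], dtype=np.int32),
        "set": np.array([3, 3, 1], dtype=np.uint8),
        "denseFramesWavIds": np.array([1, 1, 2, 3], dtype=np.float32),
    }
})
buf.seek(0)

# squeeze_me drops the singleton dimensions MATLAB adds everywhere;
# struct_as_record=False returns the struct as an object with attributes.
mat = loadmat(buf, squeeze_me=True, struct_as_record=False)
images = mat["images"]

# Pull every field out as a 1-D array (_fieldnames lists the struct's fields).
columns = {f: np.atleast_1d(getattr(images, f)) for f in images._fieldnames}

# Fields of equal length can share a DataFrame; group them by length.
by_length = {}
for field, values in columns.items():
    by_length.setdefault(len(values), {})[field] = values
frames = {n: pd.DataFrame(cols) for n, cols in by_length.items()}

for n, frame in frames.items():
    print(n, list(frame.columns))
```

With the real file you would pass the filename instead of `buf`. I would expect name, id, sp, video, track and set to end up in one frame and the two denseFrames fields in another, but I have not checked that against the actual data.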

Upvotes: 2
