ArnoBen

Reputation: 115

Converting python Dataframe to Matlab file

I am trying to convert a python Dataframe to a Matlab (.mat) file.

I start with a text file (an EEG signal) that I import using pandas.read_csv:

MyDataFrame = pd.read_csv("data.txt",sep=';',decimal='.'), data.txt being a 2D array with labels. This creates a dataframe which looks like this.

In order to convert it to .mat, I tried this solution, where the idea is to convert the dataframe into a dictionary of lists, but after trying every variation of it I still had no success.

scipy.io.savemat('EEG_data.mat', {'struct':MyDataFrame.to_dict("list")})

It did create a .mat file, but it did not save my dataframe properly. The file I obtain looks like this: all the values are basically gone, and the remaining labels you see are empty when you look into them.

I also tried using mat4py which is designed to export python structures into Matlab files, but it did not work either. I don't understand why, because converting my dataframe to a dictionary of lists is exactly what should be done according to the mat4py documentation.

Upvotes: 2

Views: 9571

Answers (2)

nekomatic

Reputation: 6284

I believe that the reason the previous solutions haven't worked for you is that your DataFrame column names are not valid MATLAB struct field names, because they contain spaces and/or start with digit characters.

When I do:

import pandas as pd
import scipy.io
MyDataFrame = pd.read_csv('eeg.txt',sep=';',decimal='.')
truncDataFrame = MyDataFrame[0:1000] # reduce data size for test purposes
scipy.io.savemat('EEGdata1.mat', {'struct1':truncDataFrame.to_dict("list")})

the result in MATLAB is a struct with the 4 fields reltime, datetime, iSensor and quality. Each of these has 1000 elements, so the data from these columns has been converted, but the rest of your data is missing.

However if I first rename the DataFrame columns:

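# assign the result instead of using inplace=True, which can trigger SettingWithCopyWarning on a slice
truncDataFrame = truncDataFrame.rename(columns=lambda x: 'col_' + x.replace(' ', '_'))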
scipy.io.savemat('EEGdata2.mat', {'struct2':truncDataFrame.to_dict("list")})

the result in MATLAB is a struct with 36 fields. This is not the same format as your mat4py solution but it does contain (as far as I can see) all the data from the source DataFrame.
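The renaming step can be generalised a little. Below is a minimal sketch of a helper that turns an arbitrary column label into something MATLAB should accept as a struct field name. The function name `to_matlab_field_name`, the `col_` prefix default and the sample labels are my own assumptions for illustration, not something taken from the question's data.

```python
import re

def to_matlab_field_name(name, prefix="col_"):
    """Return a string usable as a MATLAB struct field name (hypothetical helper)."""
    # Replace every character that is not an ASCII letter, digit or underscore.
    cleaned = re.sub(r"[^0-9A-Za-z_]", "_", str(name))
    # MATLAB identifiers must start with a letter, so prefix anything that does not.
    if not cleaned or not cleaned[0].isalpha():
        cleaned = prefix + cleaned
    return cleaned

# Made-up labels, only to show the effect of the helper.
columns = ["reltime", "1 AF3", "quality"]
print([to_matlab_field_name(c) for c in columns])
```

With a DataFrame this could be applied as `truncDataFrame = truncDataFrame.rename(columns=to_matlab_field_name)`. Unlike the lambda above, it leaves names that are already valid unchanged, so the field names in MATLAB stay closer to the original labels.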

(Note that in your question, you are creating a .mat file that contains a variable called struct and when this is loaded into MATLAB it masks the builtin struct datatype - that might also cause issues with subsequent MATLAB code.)

Upvotes: 5

ArnoBen

Reputation: 115

I finally found a solution thanks to this post. There, the poster did not create a dictionary of lists but a dictionary of integers, which worked on my side. It is a small, easily reproducible example. Then I tried to manually add lists by entering values like [1, 2], and it did not work. What did work was manually adding tuples!

MyDataFrame needs to be converted to a dictionary and if a dictionary of lists doesn't work, try with tuples.

For beginners: lists are enclosed in [] and tuples in ().

This worked for me:

import mat4py as mp
EEGdata = MyDataFrame.apply(tuple).to_dict()
mp.savemat('EEGdata.mat',{'structs': EEGdata})

EEGdata.mat should now be readable by Matlab, as it is on my side.
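If you want to be explicit about getting a dictionary of tuples, it can also be built by hand. This is only a minimal sketch: the plain dict of lists below is a stand-in for the DataFrame columns, and its names and values are made up.

```python
# Stand-in for the DataFrame: one list of values per column label (made-up data).
columns_as_lists = {"reltime": [0.0, 0.5, 1.0], "quality": [1, 1, 0]}

# Convert each column's list into a tuple, keeping the labels as keys.
columns_as_tuples = {name: tuple(values) for name, values in columns_as_lists.items()}

print(columns_as_tuples)
```

For a real DataFrame the equivalent would be `{name: tuple(col.tolist()) for name, col in MyDataFrame.items()}`, where `tolist()` converts the values to plain Python scalars first. I have not checked whether this changes what mat4py writes compared to the code above.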

Upvotes: 2
