Reputation: 115
I am trying to convert a python Dataframe to a Matlab (.mat) file.
I initially have a txt (EEG signal) that I import using panda.read_csv:
MyDataFrame = pd.read_csv("data.txt",sep=';',decimal='.')
, data.txt being a 2D array with labels. This creates a dataframe which looks like this.
In order to convert it to .mat, I tried this solution where the idea is to convert the dataframe into a dictionary of lists but after trying every aspect of this solution it's still unsuccessful.
scipy.io.savemat('EEG_data.mat', {'struct':MyDataFrame.to_dict("list")})
It did create a .mat file but it did not save my dataframe properly. The file I obtain after looks like this, so all the values are basically gone, and the remaining labels you see are empty when you look into them.
I also tried using mat4py which is designed to export python structures into Matlab files, but it did not work either. I don't understand why, because converting my dataframe to a dictionary of lists is exactly what should be done according to the mat4py documentation.
Upvotes: 2
Views: 9571
Reputation: 6284
I believe that the reason the previous solutions haven't worked for you is that your DataFrame
column names are not valid MATLAB struct
field names, because they contain spaces and/or start with digit characters.
When I do:
import pandas as pd
import scipy.io
MyDataFrame = pd.read_csv('eeg.txt',sep=';',decimal='.')
truncDataFrame = MyDataFrame[0:1000] # reduce data size for test purposes
scipy.io.savemat('EEGdata1.mat', {'struct1':truncDataFrame.to_dict("list")})
the result in MATLAB is a struct with the 4 fields reltime
, datetime
, iSensor
and quality
. Each of these has 1000 elements, so the data from these columns has been converted, but the rest of your data is missing.
However if I first rename the DataFrame columns:
truncDataFrame.rename(columns=lambda x:'col_' + x.replace(' ', '_'), inplace=True)
scipy.io.savemat('EEGdata2.mat', {'struct2':truncDataFrame.to_dict("list")})
the result in MATLAB is a struct with 36 fields. This is not the same format as your mat4py solution but it does contain (as far as I can see) all the data from the source DataFrame.
(Note that in your question, you are creating a .mat
file that contains a variable called struct
and when this is loaded into MATLAB it masks the builtin struct
datatype - that might also cause issues with subsequent MATLAB code.)
Upvotes: 5
Reputation: 115
I finally found a solution thanks to this post. There, the poster did not create a dictionary of lists but a dictionary of integers, which worked on my side. It is a small example, easily reproductible. Then I tried to manually add lists by entering values like [1, 2], an it did not work. But what worked was when I manually added tuples !
MyDataFrame needs to be converted to a dictionary and if a dictionary of lists doesn't work, try with tuples.
For beginners : lists are contained by [] and tuples by (). Here is an image showing both.
This worked for me:
import mat4py as mp
EEGdata = MyDataFrame.apply(tuple).to_dict()
mp.savemat('EEGdata.mat',{'structs': EEGdata})
EEGdata.mat should now be readable by Matlab, as it is on my side.
Upvotes: 2