Reputation: 151
I have read a MATLAB file containing a large number of arrays into Python, storing the resulting dictionary in the variable mat with the command:
mat = loadmat('Sample Matlab Extract.mat')
Is there a way to use Python's csv writing functionality to save this dictionary as a comma-separated file? The following code
import csv
# Python 3: open in text mode with newline='' (Python 2 used 'wb')
with open('mycsvfile.csv', 'w', newline='') as f:
    w = csv.writer(f)
    w.writerows(mat.items())
creates a CSV file with one column containing the array names in the dictionary and another column containing only the first element of each corresponding array. Is there a similar command that writes all of the elements of every array in the 'mat' dictionary?
Upvotes: 15
Views: 39810
Reputation: 1703
The function scipy.io.loadmat generates a dictionary that looks something like this:
{'__globals__': [],
'__header__': 'MATLAB 5.0 MAT-file, Platform: MACI, Created on: Wed Sep 24 16:11:51 2014',
'__version__': '1.0',
'a': array([[1, 2, 3]], dtype=uint8),
'b': array([[4, 5, 6]], dtype=uint8)}
It sounds like you want to make a .csv file with the keys "a", "b", etc. as the column names and their corresponding arrays as the data in each column. If so, I would recommend using pandas to build a nicely formatted dataset that can be exported to a .csv file. First, remove the metadata entries from the dictionary (all the keys beginning with "__"). Then, turn each remaining value into a pandas.Series object. The dictionary of Series can then be turned into a pandas.DataFrame object, which can be saved as a .csv file. Your code would look like this:
import scipy.io
import pandas as pd
mat = scipy.io.loadmat('matex.mat')
mat = {k:v for k, v in mat.items() if k[0] != '_'}
data = pd.DataFrame({k: pd.Series(v[0]) for k, v in mat.items()}) # compatible for both python 2.x and python 3.x
data.to_csv("example.csv")
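If you would rather stay with the csv module from the question, the same idea can be sketched without pandas. This is only a rough sketch: the function name write_mat_dict and the stand-in dictionary are made up for illustration, and it assumes every variable is a plain numeric (or string) array, not a MATLAB struct or cell array. Each array is flattened into one column, and shorter columns are padded with empty cells.

```python
import csv
from itertools import zip_longest

import numpy as np


def write_mat_dict(mat, path):
    """Write each non-metadata array in `mat` as one CSV column (hypothetical helper)."""
    # Drop loadmat's metadata keys such as '__header__' and flatten each array.
    columns = {
        key: np.ravel(value).tolist()
        for key, value in mat.items()
        if not key.startswith('__')
    }
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(columns.keys())  # header row: variable names
        # zip_longest pads shorter columns with empty cells.
        writer.writerows(zip_longest(*columns.values(), fillvalue=''))


# Stand-in for the dictionary returned by scipy.io.loadmat (assumed contents).
mat = {
    '__header__': 'MATLAB 5.0 MAT-file',
    'a': np.array([[1, 2, 3]], dtype=np.uint8),
    'b': np.array([[4, 5]], dtype=np.uint8),
}
```

Calling write_mat_dict(mat, 'example.csv') would then write the header row followed by one row per element index.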
Upvotes: 15
Reputation: 485
Reading a MAT-file (.mat) with data = scipy.io.loadmat(files[0]) gives a dictionary of keys and values. The keys '__header__', '__version__' and '__globals__' are default metadata entries that we need to remove:
cols = []
for i in data:
    if '__' not in i:
        cols.append(i)
temp_df = pd.DataFrame(columns=cols)
for i in data:
    if '__' not in i:
        temp_df[i] = (data[i]).ravel()
We skip the unwanted metadata keys using "if '__' not in i:", create a DataFrame whose columns are the remaining keys, and finally assign each flattened array to its column. Note that this requires all arrays to have the same number of elements.
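When the arrays do not all have the same number of elements, assigning them column by column raises a length-mismatch error. One possible workaround, sketched here with a made-up helper name (mat_dict_to_frame) and a stand-in dictionary in place of a real loadmat result, is to wrap each flattened array in a pandas.Series so that shorter columns are padded with NaN:

```python
import numpy as np
import pandas as pd


def mat_dict_to_frame(data):
    """Build a DataFrame with one column per non-metadata variable (hypothetical helper)."""
    return pd.DataFrame({
        # Series are aligned on their index, so shorter ones are padded with NaN.
        key: pd.Series(np.ravel(value))
        for key, value in data.items()
        if '__' not in key
    })


# Stand-in for a loadmat result; the variables 'x' and 'y' are invented.
data = {
    '__version__': '1.0',
    'x': np.array([[1.0, 2.0, 3.0]]),
    'y': np.array([[4.0], [5.0]]),
}
frame = mat_dict_to_frame(data)
```

The resulting frame can be written out with frame.to_csv(...) as usual.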
Upvotes: 0
Reputation: 199
import scipy.io
import pandas as pd
import numpy as np
from datetime import datetime, timedelta

class MatDataToCSV():
    def __init__(self):
        pass

    def convert_mat_tocsv(self):
        mat = scipy.io.loadmat('wiki.mat')
        # Number of entries in the first field of the 'wiki' struct
        instances = mat['wiki'][0][0][0].shape[1]
        columns = ["dob", "photo_taken", "full_path", "gender",
                   "name", "face_location", "face_score", "second_face_score"]
        df = pd.DataFrame(index=range(0, instances), columns=columns)
        for i in mat:
            if i == "wiki":
                current_array = mat[i][0][0]
                # Copy each struct field into the matching column
                for j in range(len(current_array)):
                    df[columns[j]] = pd.DataFrame(current_array[j][0])
        return df
Upvotes: 1
Reputation: 124
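This converts the numeric arrays in a .mat file into .csv files, one file per variable (writing every variable to the same file name would overwrite it on each pass). Try it:
import scipy.io
import numpy as np

data = scipy.io.loadmat("file.mat")
for i in data:
    # Skip loadmat's metadata keys and any 'readme' variable
    if '__' not in i and 'readme' not in i:
        np.savetxt(i + ".csv", data[i], delimiter=',')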
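np.savetxt expects numeric 1-D or 2-D input, so variables holding strings, structs or cell arrays will make it fail. A slightly more defensive variant might look like the sketch below. The function name save_variables and the output layout (one <name>.csv per variable in a directory you choose) are assumptions for illustration, not part of any library.

```python
import os

import numpy as np


def save_variables(data, out_dir):
    """Write every numeric variable in `data` to <out_dir>/<name>.csv (hypothetical helper)."""
    written = []
    for name, value in data.items():
        if name.startswith('__'):
            continue  # loadmat metadata
        arr = np.atleast_2d(value)
        if arr.ndim > 2 or not np.issubdtype(arr.dtype, np.number):
            continue  # skip strings, structs, cell arrays and N-D arrays
        # Integer arrays are written without a decimal exponent.
        fmt = '%d' if np.issubdtype(arr.dtype, np.integer) else '%.18e'
        path = os.path.join(out_dir, name + '.csv')
        np.savetxt(path, arr, delimiter=',', fmt=fmt)
        written.append(path)
    return written
```

With a real file this would be called as save_variables(scipy.io.loadmat("file.mat"), "some_output_dir"), where the directory must already exist.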
Upvotes: 3