curioso

Reputation: 195

How to write .mat-v7.3 files using h5py?

I have been able to write .mat-v7.3 files with hdf5storage after reading the answers to "Create .mat v7.3 files from Python using hdf5storage", but I'm convinced I can achieve the same result with h5py alone by writing the right header into my .mat file.

Assume that I have a Pandas dataframe built from the following data:

[
 {'time': 0, 'sig0': 0.6153857, 'sig1': 0.070254125, 'sig2': 0.025843188}, 
 {'time': 586576558, 'sig0': 0.6015989, 'sig1': 0.7131938, 'sig2': 0.42542282},
...
 {'time': 589999558, 'sig0': 0.1598977, 'sig1': 0.6131938, 'sig2': 0.88882282}
]

How can I parse this data and create an HDF5 file that is .mat-v7.3 compatible?

Answers (1)

curioso

Reputation: 195

If you look at hdf5storage's GitHub repo, you will find the header defined here. It says the file needs a 512-byte user block at the start, whose first 128 bytes hold metadata such as the creation date, platform version, character encoding and so on.
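To make the layout concrete, here is a minimal sketch of how those 128 bytes can be assembled with the standard library only. The function name `build_mat73_header` and the constants are my own, not part of hdf5storage or h5py, and the field split (116 bytes of text, 8 zero bytes, a 2-byte version, 2 bytes `IM`) is my reading of the MAT-file header, so treat it as an assumption to check against the repo.

```python
import struct

HEADER_SIZE = 128  # size of the MAT header at the start of the user block
TEXT_SIZE = 116    # space-padded descriptive text


def build_mat73_header(text: str) -> bytes:
    """Return a 128-byte header: padded text followed by 12 binary bytes.

    Hypothetical helper for illustration only.
    """
    encoded = text.encode("ascii")
    if len(encoded) > TEXT_SIZE:
        raise ValueError(f"header text must fit in {TEXT_SIZE} bytes")
    padded = encoded.ljust(TEXT_SIZE, b" ")
    # 8 zero bytes, then 0x0200 packed little-endian (bytes 00 02), then b"IM".
    # This is the same byte sequence as the hex string 00000000 00000000 0002494D.
    tail = struct.pack("<8sH2s", bytes(8), 0x0200, b"IM")
    return padded + tail
```

The remaining 384 bytes of the 512-byte user block can stay as h5py leaves them; only the first 128 are overwritten.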

Then, all you need is to create one dataset per dataframe column. Here's an example combining these two steps:

import datetime
import h5py
import pandas as pd
import sys

data = [
    {'time': 0, 'sig0': 0.6153857, 'sig1': 0.070254125, 'sig2': 0.025843188},
    {'time': 586576558, 'sig0': 0.6015989, 'sig1': 0.7131938, 'sig2': 0.42542282},
    {'time': 589999558, 'sig0': 0.1598977, 'sig1': 0.6131938, 'sig2': 0.88882282}
]
df = pd.DataFrame(data)

def mat_export(df, export_path):
    def write_userblock(filename):
        # Build the 128-byte header: space-padded text up to byte 116,
        # followed by the 12 binary bytes given as hex below.
        now = datetime.datetime.now()
        v = sys.version_info
        platform_version = f"CPython {v.major}.{v.minor}.{v.micro}"
        created_on = now.strftime("%a %b %d %H:%M:%S %Y")
        header = f"MATLAB 7.3 MAT-file, Platform: {platform_version}, Created on: {created_on} HDF5 schema 1.00 ."
        header_bytes = bytearray(header, "utf-8")
        header_bytes.extend(bytearray((128 - 12 - len(header_bytes)) * " ", "utf-8"))
        header_bytes.extend(bytearray.fromhex("00000000 00000000 0002494D"))
        # "r+b" overwrites the start of the file in place without truncating it.
        with open(filename, "r+b") as f:
            f.write(header_bytes)

    def write_h5py(data, filename, userblock_size=512):
        # Create the file with an empty user block, then close it so the
        # user block can be filled in with a plain file write.
        with h5py.File(filename, "w", userblock_size=userblock_size):
            pass
        write_userblock(filename)
        # Reopen in append mode and store each column as its own dataset.
        with h5py.File(filename, "a") as f:
            for column in data.columns:
                f.create_dataset(column, data=data[column].to_numpy(), maxshape=(None,), chunks=True)

    write_h5py(df, export_path)

export_path = 'your_data.mat'
mat_export(df, export_path)
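If you want to check the result without opening MATLAB, a small sketch like the following reads the first 128 bytes back and compares them with what the export is supposed to write. The helper names `read_mat_header` and `looks_like_mat73` are hypothetical, and this only checks the header bytes, not whether MATLAB accepts the datasets.

```python
def read_mat_header(path):
    """Return (text, tail) read from the first 128 bytes of the file."""
    with open(path, "rb") as f:
        raw = f.read(128)
    if len(raw) < 128:
        raise ValueError("file is shorter than a 128-byte header")
    # The first 116 bytes are space-padded text; the last 12 are binary.
    text = raw[:116].decode("ascii", errors="replace").rstrip()
    return text, raw[116:]


def looks_like_mat73(path):
    """Check the text prefix and the 12 trailing bytes used in the export."""
    text, tail = read_mat_header(path)
    expected_tail = bytes.fromhex("00000000 00000000 0002494D")
    return text.startswith("MATLAB 7.3 MAT-file") and tail == expected_tail
```

Calling `looks_like_mat73('your_data.mat')` after the export should return True if the user block was written as intended. Opening the file with h5py and looking at its `userblock_size` attribute is another quick sanity check.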

That's it. I hope this Q&A helps people trying to export .mat-compatible HDF5 files with Python.
