ely

Reputation: 77464

Adding my own description attribute to a Pandas DataFrame

I am retrieving some web data, parsing it, and storing the output as a Pandas DataFrame in an HDF5 file. Right before I write the DataFrame to the H5 file, I attach my own description string to record some metadata about where the data came from and whether anything went wrong while parsing it.

In [1]: my_data_frame.desc = "Some string about the data"

In [2]: my_data_frame.desc
Out[2]: 'Some string about the data'

In [3]: print(type(my_data_frame))
<class 'pandas.core.frame.DataFrame'>

However, after loading the same data back with pandas.io.pytables.HDFStore(), the desc attribute I added is missing. Accessing it raises AttributeError: 'DataFrame' object has no attribute 'desc', as if I had never set it.

How can I get my metadata descriptions to persist as an extra attribute of the DataFrame object? (Or is there some existing, recognized attribute of a DataFrame that I can hijack for my metadata purposes?)

Upvotes: 9

Views: 3320

Answers (1)

Wes McKinney

Reputation: 105591

Adding DataFrame metadata or per-column metadata is on the roadmap but hasn't been implemented yet. I'm open to ideas about what the API should look like, though.
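In the meantime, one possible workaround is to keep the description on the HDF5 node rather than on the DataFrame object, since only the stored data survives the round trip. The sketch below assumes PyTables is installed and uses HDFStore.get_storer() to reach the node's attributes. The helper names save_with_desc and load_with_desc, and the attribute name desc, are made up for illustration and are not part of pandas.

```python
import pandas as pd


def save_with_desc(path, key, frame, desc):
    """Write frame to an HDF5 file and attach desc to the stored node.

    Hypothetical helper: the name and signature are illustrative only.
    """
    with pd.HDFStore(path) as store:
        store.put(key, frame)
        # get_storer() returns the storage object for this key; its .attrs
        # exposes the attributes of the underlying PyTables group node.
        store.get_storer(key).attrs.desc = desc


def load_with_desc(path, key):
    """Read back the frame and the description stored next to it.

    Hypothetical helper: the name and signature are illustrative only.
    """
    with pd.HDFStore(path, mode="r") as store:
        frame = store[key]
        desc = store.get_storer(key).attrs.desc
    return frame, desc
```

With these helpers, save_with_desc("data.h5", "my_data_frame", my_data_frame, "Some string about the data") writes both pieces, and load_with_desc("data.h5", "my_data_frame") returns the frame together with its description. The description is stored alongside the DataFrame in the file, not as an attribute of the DataFrame, so it has to be carried separately after loading.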

Upvotes: 6
