Reputation: 4584
I want to add a description to a column, such as the source the data was collected from. Is such a thing possible? A similar question was asked about 8 years ago at "Adding my own description attribute to a Pandas DataFrame", but it has no answer.
My code:
df=
index colA colB
#description from SensorA SensorB # Description row
1
2
3
Upvotes: 4
Views: 4848
Reputation: 19300
A comment on pandas-dev/pandas#2485 suggests using _metadata and .attrs. See https://pandas.pydata.org/pandas-docs/stable/development/extending.html#define-original-properties for more information.
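For the .attrs route, here is a minimal sketch, assuming pandas 1.0 or later (where DataFrame.attrs exists; the docs mark it as experimental). The key names "description" and "column_sources" are made up for this example and are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({"colA": [1, 2, 3], "colB": [10, 12, 13]})

# .attrs is a plain dict attached to the DataFrame.
# The keys below are hypothetical names chosen for this example.
df.attrs["description"] = "About my data"
df.attrs["column_sources"] = {"colA": "SensorA", "colB": "SensorB"}

# Look up where a given column came from.
print(df.attrs["column_sources"]["colA"])  # prints 'SensorA'
```

This keeps the description out of the data itself, so there is no need for an extra "description row" that would turn the numeric columns into object dtype.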
One way to do this is to subclass pandas.DataFrame and add _metadata.
Define _metadata for normal properties which will be passed to manipulation results.
import pandas as pd

class SubclassedDataFrame(pd.DataFrame):
    # normal properties
    _metadata = ['description']

    @property
    def _constructor(self):
        return SubclassedDataFrame

data = {"a": [1, 2, 3], "b": [10, 12, 13]}
df = SubclassedDataFrame(data)
df.description = "About my data"
Setting _metadata in the subclass indicates that these properties should be propagated after manipulation. See the example using .head() below for a demonstration of the difference between pd.DataFrame and this subclass.
data = {"a": [1, 2, 3], "b": [10, 12, 13]}
df = SubclassedDataFrame(data)
df.description = "About my data"
print(df.head().description)  # prints 'About my data'
df_orig = pd.DataFrame(data)
df_orig.description = "About my data"
df_orig.head().description  # raises AttributeError
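To get per-column descriptions like the SensorA/SensorB row in the question, the same subclass approach can carry a dict instead of a string. This is only a sketch: the class name DescribedDataFrame and the attribute name column_sources are hypothetical and chosen for illustration.

```python
import pandas as pd

class DescribedDataFrame(pd.DataFrame):
    # Hypothetical subclass; "column_sources" is a name made up for this sketch.
    _metadata = ["column_sources"]

    @property
    def _constructor(self):
        return DescribedDataFrame

df = DescribedDataFrame({"colA": [1, 2, 3], "colB": [10, 12, 13]})
# Map each column name to the source it was collected from.
df.column_sources = {"colA": "SensorA", "colB": "SensorB"}

# The mapping is carried over to the result of .head().
print(df.head().column_sources["colB"])
```

Note that pandas only copies the attribute along. It does not update the mapping if you rename or drop columns, so you would have to keep it in sync yourself.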
Upvotes: 5