Mainland

Reputation: 4584

Python Dataframe Adding a description to column

I want to add a description to a column, such as the source the data was collected from. Is such a thing possible? A similar question, "Adding my own description attribute to a Pandas DataFrame", was asked about 8 years ago and has no answer.

My code:

df =

index           colA            colB
#description    from SensorA    SensorB    # description row
1
2
3

Upvotes: 4

Views: 4848

Answers (1)

jkr

Reputation: 19300

A comment on pandas-dev/pandas#2485 suggests using _metadata and .attrs. See https://pandas.pydata.org/pandas-docs/stable/development/extending.html#define-original-properties for more information.
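
As a minimal sketch of the .attrs route, assuming a pandas version that provides DataFrame.attrs (1.0 or later, where the docs mark it as experimental): attrs is a plain dict attached to the frame, so per-column notes can be stored under a key. The key name column_descriptions and the column names are illustrative choices taken from the question, not a pandas convention.

```python
import pandas as pd

df = pd.DataFrame({"colA": [1, 2, 3], "colB": [10, 12, 13]})

# attrs is a dict stored on the DataFrame itself.
# "column_descriptions" is a hypothetical key chosen for this example.
df.attrs["column_descriptions"] = {
    "colA": "from SensorA",
    "colB": "from SensorB",
}

# Look up the description for one column.
col_a_description = df.attrs["column_descriptions"]["colA"]
print(col_a_description)
```

This needs no subclass, but whether attrs survives a given operation depends on the pandas version, so it is worth checking for the operations you rely on.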

One way to do this is to subclass pandas.DataFrame and list the attribute name in _metadata. The pandas documentation describes it this way:

"Define _metadata for normal properties which will be passed to manipulation results."

import pandas as pd

class SubclassedDataFrame(pd.DataFrame):

    # normal properties
    _metadata = ['description']

    @property
    def _constructor(self):
        return SubclassedDataFrame

data = {"a": [1, 2, 3], "b": [10, 12, 13]}

df = SubclassedDataFrame(data)

df.description = "About my data"

Setting _metadata in the subclass indicates that these properties should be propagated after manipulation. See the example using .head() below for a demonstration of the difference between pd.DataFrame and this subclass.

data = {"a": [1, 2, 3], "b": [10, 12, 13]}

df = SubclassedDataFrame(data)
df.description = "About my data"
print(df.head().description)  # prints: About my data

df_orig = pd.DataFrame(data)
df_orig.description = "About my data"  # plain attribute, not listed in _metadata
df_orig.head().description  # raises AttributeError
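
To get closer to the per-column description asked for in the question, the same subclass approach could hold a dict keyed by column name. This is only a sketch: the class name DescribedDataFrame and the attribute column_descriptions are made up for illustration, and it relies on the same _metadata propagation shown above.

```python
import pandas as pd


class DescribedDataFrame(pd.DataFrame):
    # Hypothetical attribute name; listing it in _metadata asks pandas
    # to carry it over to the results of manipulations.
    _metadata = ["column_descriptions"]

    @property
    def _constructor(self):
        return DescribedDataFrame


df = DescribedDataFrame({"colA": [1, 2, 3], "colB": [10, 12, 13]})
df.column_descriptions = {"colA": "from SensorA", "colB": "from SensorB"}

# The descriptions are still reachable on the result of head().
first_rows = df.head(2)
print(first_rows.column_descriptions["colB"])
```

Note that the dict is passed along by reference rather than copied, so editing it on one frame changes it for every frame derived from it.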

Upvotes: 5
