Reputation: 4025
I like to add attributes to pandas DataFrame columns, for example to manage labels or units.
import pandas as pd
df = pd.DataFrame([[1, 2], [5, 6]], columns=['A', 'B'])
df['A'].units = 'm/s'
Accessing the units of the column (with df['A'].units) returns m/s.
However, the attribute gets lost after any DataFrame to Series operation, such as adding a new column:
df['C'] = [3, 8]
df['A'].units
AttributeError: 'Series' object has no attribute 'units'
Is there a way to keep the attributes, or an alternative way to add columns that preserves them?
Upvotes: 0
Views: 180
Reputation: 76947
_metadata is not part of the public API, so this is not a stable way of doing it. Still, for now:
In [8]: df = pd.DataFrame([[1, 2], [5, 6]], columns=['A', 'B'])
In [9]: df['A']._metadata
Out[9]: ['name']
In [10]: df['A']._metadata.append({'units': 'm/s'})
In [11]: df['C'] = [3, 8]
In [12]: df['A']._metadata
Out[12]: ['name', {'units': 'm/s'}]
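One caveat about the above: as far as I can tell, _metadata is a class-level list shared by every Series, so appending to it is not really per-column storage. A possibly safer alternative is the DataFrame-level attrs dictionary, which is public but documented as experimental. The sketch below assumes pandas 1.0 or later; the 'units' key and the column-to-unit mapping are just a convention chosen for illustration, not something pandas defines.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [5, 6]], columns=['A', 'B'])

# Store per-column units in the DataFrame-level attrs dict.
# The 'units' key is an arbitrary name picked for this sketch.
df.attrs['units'] = {'A': 'm/s'}

# In-place column assignment modifies the same DataFrame object,
# so the attrs dict is still attached afterwards.
df['C'] = [3, 8]

print(df.attrs['units']['A'])  # m/s
```

Because the units live on the DataFrame rather than on the temporary Series returned by df['A'], adding a column does not drop them. Operations that build a new DataFrame (merge, concat, and so on) may or may not carry attrs over, so check after such steps.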
Upvotes: 1