Reputation: 1841
I'm trying to create a (very simple) pandas subclass, like so:
import pandas as pd

data = pd.DataFrame({'A': [1, 2], 'B': [2, 3], 'C': [4, 5]})

class TestFrame(pd.DataFrame):
    # See https://pandas.pydata.org/pandas-docs/stable/development/extending.html#extending-extension-types
    _metadata = pd.DataFrame._metadata + ["addnl"]

    @property
    def _constructor(self):
        return TestFrame

    @property
    def _constructor_sliced(self):
        return pd.Series

    @classmethod
    def plus_one(
        cls,
        df,
    ):
        tf = super().__new__(cls, df)
        tf.addnl = 1
        return tf

t1 = TestFrame.plus_one(data)
This proceeds without error, except that trying to view t1 gives me AttributeError: 'TestFrame' object has no attribute '_data'.
I think this is because I am calling DataFrame.__new__ instead of __init__, because it gives the same error for this:
object.__new__(pd.DataFrame, {'A': [1, 2], 'B': [2, 3], 'C': [4, 5]})
However, I can't then find a way to define the constructor. This is made more problematic by the fact that the pandas subclassing infrastructure doesn't yet (as far as I can tell) let you define an __init__ with new attributes.
Any help much appreciated.
Upvotes: 0
Views: 462
Reputation: 402323
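The issue here is that the line tf = super().__new__(cls, df) does not make sense. You are not overriding DataFrame.__init__ or __new__, so you don't have to use super() to call them. Calling __new__ directly only allocates the object and skips __init__, which is why the frame's internal data (_data) is never set up.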
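To see the same failure without pandas involved, here is a minimal sketch using plain Python classes. Base, Child, broken, working and _state are hypothetical names made up for illustration; Base only stands in for a class like DataFrame that sets up its internal state in __init__ and defines no __new__ of its own.

```python
class Base:
    def __init__(self, value):
        # Stands in for the internal state that __init__ is responsible for
        self._state = value


class Child(Base):
    @classmethod
    def broken(cls, value):
        # Base defines no __new__, so this resolves to object.__new__:
        # the instance is allocated, but Base.__init__ never runs
        return super().__new__(cls, value)

    @classmethod
    def working(cls, value):
        # Calling the class runs __new__ and then __init__
        return cls(value)


print(hasattr(Child.broken(1), "_state"))  # False: __init__ was skipped
print(Child.working(1)._state)             # 1
```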
If the idea is to instantiate a frame of type TestFrame, you can use tf = cls(df).
    @classmethod
    def plus_one(cls, df):
        tf = cls(df)
        tf.addnl = 1
        return tf
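For completeness, here is how the whole class from the question might look with that change applied. This is a sketch that assumes a reasonably recent pandas release; it drops the _constructor_sliced override from the question only because that one already returned the default pd.Series.

```python
import pandas as pd


class TestFrame(pd.DataFrame):
    # Names listed in _metadata are stored as ordinary instance attributes
    _metadata = pd.DataFrame._metadata + ["addnl"]

    @property
    def _constructor(self):
        # Makes pandas operations return TestFrame rather than DataFrame
        return TestFrame

    @classmethod
    def plus_one(cls, df):
        # cls(df) goes through DataFrame.__init__, so internals are set up
        tf = cls(df)
        tf.addnl = 1
        return tf


data = pd.DataFrame({'A': [1, 2], 'B': [2, 3], 'C': [4, 5]})
t1 = TestFrame.plus_one(data)
print(type(t1).__name__, t1.addnl)
print(t1)
```

Whether addnl survives later operations (slicing, arithmetic, and so on) depends on how pandas propagates _metadata in the version you are running, so it is worth checking for the operations you care about.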
Upvotes: 2