Reputation: 294258
According to https://pandas.pydata.org/pandas-docs/stable/internals.html I should be able to subclass a pandas Series.
My MCVE is:
from pandas import Series

class Xseries(Series):
    _metadata = ['attr']

    @property
    def _constructor(self):
        return Xseries

    def __init__(self, *args, **kwargs):
        self.attr = kwargs.pop('attr', 0)
        super().__init__(*args, **kwargs)

s = Xseries([1, 2, 3], attr=3)
Notice that the attr attribute is:
s.attr
3
However, when I multiply by 2
(s * 2).attr
0
Which is the default. Therefore, attr was not passed on. You may ask, maybe that isn't the intended behavior? I think it is, according to the documentation: https://pandas.pydata.org/pandas-docs/stable/internals.html#define-original-properties
And if we use the mul method, it seems to work:
s.mul(2).attr
3
And this doesn't (which is the same as s * 2):
s.__mul__(2).attr
0
I wanted to put this past SO before I created an issue on GitHub. Is this a bug?
Is there a workaround?
I need to be able to do s * 2 and have the attr attribute passed on to the result.
Upvotes: 4
Views: 81
Reputation: 294258
I will delete this answer if @chrisb posts a similar one.
As posted by @chrisb here, this is an open issue.
Matthiasha posted a workaround that is recreated below using my example from the question.
from pandas import Series

class Xseries(Series):
    _metadata = ['attr']

    @property
    def _constructor(self):
        def _c(*args, **kwargs):
            # workaround for https://github.com/pandas-dev/pandas/issues/13208
            return Xseries(*args, **kwargs).__finalize__(self)
        return _c

    def __init__(self, *args, **kwargs):
        self.attr = kwargs.pop('attr', 0)
        super().__init__(*args, **kwargs)
And now the problem is solved:
(Xseries([1, 2, 3], attr=3) * 2).attr
3
Upvotes: 0
Reputation: 20214
If you use inspect.getsourcelines to check the source code of the two functions mul and __mul__, you will find they actually have different implementations.
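That comparison can be sketched roughly as follows. This assumes the pandas source files are available on disk so that inspect can read them; the variable names mul_src and dunder_src are only illustrative, and the exact source you see will depend on your pandas version.

```python
import inspect

from pandas import Series

# getsourcelines returns (list_of_source_lines, starting_line_number)
mul_src, mul_start = inspect.getsourcelines(Series.mul)
dunder_src, dunder_start = inspect.getsourcelines(Series.__mul__)

print("Series.mul:")
print("".join(mul_src))
print("Series.__mul__:")
print("".join(dunder_src))

# The two listings are not the same code
print("identical source:", mul_src == dunder_src)
```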
And using s.mul(2).attr does not really work either: mul just uses __finalize__ to propagate the attributes unchanged, it does not actually multiply attr.
Or maybe I am misunderstanding your question and you just want to propagate attr, not multiply it as well?
If yes, you can modify your custom __mul__ function to call __finalize__.
from pandas import Series

class Xseries(Series):
    _metadata = ['attr']

    @property
    def _constructor(self):
        return Xseries

    def __init__(self, *args, **kwargs):
        self.attr = kwargs.pop('attr', 0)
        super().__init__(*args, **kwargs)

    def __mul__(self, other):
        internal_result = super().__mul__(other)
        return internal_result.__finalize__(self)

s = Xseries([1, 2, 3], attr=3)
If not, you can manually multiply attr and return the result.
from pandas import Series

class Xseries(Series):
    _metadata = ['attr']

    @property
    def _constructor(self):
        return Xseries

    def __init__(self, *args, **kwargs):
        self.attr = kwargs.pop('attr', 0)
        super().__init__(*args, **kwargs)

    def __mul__(self, other):
        internal_result = super().__mul__(other)
        if hasattr(other, "attr"):
            internal_result.attr = self.attr * other.attr
        else:
            internal_result.attr = self.attr * other
        return internal_result

s = Xseries([1, 2, 3], attr=3)
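To compare the two variants side by side, here is a small sketch. The class names XseriesBase, XseriesKeep and XseriesScale are made up for this illustration (they are not from the question); the shared base class only exists to avoid repeating the boilerplate.

```python
from pandas import Series


class XseriesBase(Series):
    # Hypothetical shared base: the _metadata / __init__ boilerplate from above
    _metadata = ['attr']

    def __init__(self, *args, **kwargs):
        self.attr = kwargs.pop('attr', 0)
        super().__init__(*args, **kwargs)


class XseriesKeep(XseriesBase):
    # Variant 1: copy attr from the left operand unchanged
    @property
    def _constructor(self):
        return XseriesKeep

    def __mul__(self, other):
        return super().__mul__(other).__finalize__(self)


class XseriesScale(XseriesBase):
    # Variant 2: multiply attr together with the data
    @property
    def _constructor(self):
        return XseriesScale

    def __mul__(self, other):
        result = super().__mul__(other)
        factor = other.attr if hasattr(other, 'attr') else other
        result.attr = self.attr * factor
        return result


kept = XseriesKeep([1, 2, 3], attr=3) * 2
scaled = XseriesScale([1, 2, 3], attr=3) * 2

print(kept.tolist(), kept.attr)
print(scaled.tolist(), scaled.attr)
```

Note that both overrides only cover s * 2. An expression like 2 * s goes through __rmul__ instead, so if you need that form as well you would have to override __rmul__ in the same way.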
Upvotes: 2