Reputation: 697
I am trying to subtract two dataframes. The logic I have for __sub__ behaves as expected, but the logic for __rsub__ does not output the reverse. How can I get the __rsub__ output to match?
import pandas as pd


class Base:
    def __init__(self):
        self.data = pd.DataFrame({'a': [5, 6]})

    def __sub__(self, other):
        return self.data - other

    def __rsub__(self, other):
        return other - self.data


base = Base()
df = pd.DataFrame({'a': [1, 2]})
# Unexpected output
>>> df - base
a
0 a
0 -4
1 -5
1 a
0 -3
1 -4
# Expected output
>>> base - df
   a
0  4
1  4
Upvotes: 1
Views: 60
Reputation: 8900
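Short answer: your implementation of __rsub__ is never run with df as other. Python tries df.__sub__(base) first, and pandas handles the operation itself instead of deferring to Base. The nested output suggests that pandas treats base as a scalar and subtracts it from each cell, so __rsub__ only ever sees the individual values 1 and 2. You can step through the execution in a debugger to see for yourself (e.g. in IPython: %debug base - df, then step through the execution with s).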
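To make the dispatch order concrete, here is a minimal pure-Python sketch. Frame and Strict are hypothetical stand-ins, not pandas classes: Frame plays the role of a left operand whose __sub__ accepts anything, and Strict is one that declines by returning NotImplemented. The reflected method only runs in the second case.

```python
class Frame:
    """Hypothetical stand-in for a left operand that handles any right operand."""

    def __sub__(self, other):
        return "Frame.__sub__"


class Strict:
    """Hypothetical left operand that declines operands it does not know."""

    def __sub__(self, other):
        # Returning NotImplemented tells Python to try the reflected method.
        return NotImplemented


class Base:
    def __rsub__(self, other):
        return "Base.__rsub__"


handled_by_left = Frame() - Base()    # the left operand wins, __rsub__ is skipped
handled_by_right = Strict() - Base()  # the left operand declined, __rsub__ runs

print(handled_by_left)   # Frame.__sub__
print(handled_by_right)  # Base.__rsub__
```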
The only way to override pandas' logic here would be to make Base a subclass of pd.DataFrame, as you can learn from a helpful note in Python's documentation on __rsub__ (my emphasis):
Note: If the right operand’s type is a subclass of the left operand’s type and that subclass provides a different implementation of the reflected method for the operation, this method will be called before the left operand’s non-reflected method. This behavior allows subclasses to override their ancestors’ operations
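The rule in that note can be checked with a small pure-Python sketch. Parent, Child and Unrelated are hypothetical names used only for illustration; Parent stands in for the left operand's type.

```python
class Parent:
    def __sub__(self, other):
        return "Parent.__sub__"

    def __rsub__(self, other):
        return "Parent.__rsub__"


class Child(Parent):
    # A different implementation of the reflected method than the parent's.
    def __rsub__(self, other):
        return "Child.__rsub__"


class Unrelated:
    def __rsub__(self, other):
        return "Unrelated.__rsub__"


# The right operand is a subclass of the left operand's type and overrides
# __rsub__, so Python calls Child.__rsub__ before Parent.__sub__.
with_subclass = Parent() - Child()

# The right operand is not a subclass, so Parent.__sub__ is tried first
# and, because it returns a result, the reflected method never runs.
with_unrelated = Parent() - Unrelated()

print(with_subclass)   # Child.__rsub__
print(with_unrelated)  # Parent.__sub__
```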
If you really need to subclass pandas data structures, learn from a good example such as GeoPandas' GeoDataFrame.
Otherwise, I would follow pandas' many recommendations against subclassing and use composition instead, unless I wanted to accomplish something similar to what GeoDataFrame did for DataFrame.
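As a rough sketch of what the composition route could look like for the classes in the question: keep the DataFrame as an attribute and make the reversed subtraction explicit rather than relying on __rsub__. The subtract_from method is a hypothetical helper name, not part of pandas.

```python
import pandas as pd


class Base:
    """Holds a DataFrame as an attribute instead of inheriting from it."""

    def __init__(self):
        self.data = pd.DataFrame({'a': [5, 6]})

    def __sub__(self, other):
        # base - df works because Base is the left operand.
        return self.data - other

    def subtract_from(self, other):
        # Hypothetical helper: explicit replacement for `df - base`.
        return other - self.data


base = Base()
df = pd.DataFrame({'a': [1, 2]})

forward = base - df               # column a: 4, 4
reverse = df - base.data          # column a: -4, -4
same = base.subtract_from(df)     # same values as `reverse`
```

Both operands of the reversed subtraction are then plain DataFrames, so pandas' own alignment and arithmetic apply without any operator dispatch surprises.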
Upvotes: 1