kav

Reputation: 697

Unexpected __rsub__ output when using dataframe

I am trying to subtract two dataframes.

The logic I have for __sub__ behaves as expected, but __rsub__ does not return the same kind of element-wise result with the operands reversed.

How can I get the __rsub__ output to match?

import pandas as pd

class Base:
    def __init__(self):
        self.data = pd.DataFrame({'a': [5, 6]})

    def __sub__(self, other):
        return self.data - other

    def __rsub__(self, other):
        return other - self.data

base = Base()
df = pd.DataFrame({'a': [1, 2]})
# Unexpected output

>>> df - base
                a
0     a
0 -4
1 -5
1     a
0 -3
1 -4
# Expected output

>>> base - df
   a
0  4
1  4

Upvotes: 1

Views: 60

Answers (1)

ojdo

Reputation: 8900

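Short answer: your __rsub__ is never called with the whole DataFrame as other. In df - base, Python tries the left operand first, so pandas' DataFrame.__sub__ runs; it treats base as a scalar and applies the subtraction element by element, which is why each cell of your result holds its own DataFrame. You can step through the execution in a debugger to see for yourself (e.g. in IPython: %debug df - base and then step through the execution).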
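To check this without a debugger, here is a minimal sketch that records what __rsub__ actually receives. The seen_dataframe attribute is a hypothetical addition for illustration only; everything else is taken from the question.

```python
import pandas as pd

class Base:
    def __init__(self):
        self.data = pd.DataFrame({'a': [5, 6]})
        # Hypothetical bookkeeping: one entry per __rsub__ call,
        # True if `other` was a DataFrame, False otherwise.
        self.seen_dataframe = []

    def __sub__(self, other):
        return self.data - other

    def __rsub__(self, other):
        self.seen_dataframe.append(isinstance(other, pd.DataFrame))
        return other - self.data

base = Base()
df = pd.DataFrame({'a': [1, 2]})

result = df - base  # DataFrame.__sub__ handles this first

# Each recorded call received a single element of df, not df itself.
print(base.seen_dataframe)
```

If the list contains only False entries, __rsub__ was reached from inside pandas' element-wise arithmetic and never with df as its argument.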

The only way to override pandas' logic here would be to make Base a subclass of pd.DataFrame, as a helpful note in Python's documentation on __rsub__ explains:

Note: If the right operand’s type is a subclass of the left operand’s type and that subclass provides a different implementation of the reflected method for the operation, this method will be called before the left operand’s non-reflected method. This behavior allows subclasses to override their ancestors’ operations.
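The rule in that note can be seen with plain Python classes, no pandas involved. The class names Left, Unrelated and Child below are made up for this sketch.

```python
class Left:
    def __sub__(self, other):
        return "Left.__sub__"

class Unrelated:
    # Not a subclass of Left, so this is only a fallback.
    def __rsub__(self, other):
        return "Unrelated.__rsub__"

class Child(Left):
    # Subclass of Left that provides its own reflected method.
    def __rsub__(self, other):
        return "Child.__rsub__"

# Left operand wins: Left.__sub__ runs and returns a result.
print(Left() - Unrelated())  # Left.__sub__

# Right operand is a subclass with its own __rsub__, so it is tried first.
print(Left() - Child())      # Child.__rsub__
```

The Base class in the question is in the position of Unrelated: pandas' DataFrame.__sub__ gets the first attempt and produces a result, so Python never falls back to Base.__rsub__ for the operation as a whole.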

If you really, really need to subclass pandas data structures, learn from a good example like GeoPandas' GeoDataFrame.

But I would follow pandas' many recommendations against subclassing and use composition instead, unless I wanted to accomplish something similar to what GeoDataFrame did for DataFrame.
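With composition, one way to get the reversed result is to make sure a plain DataFrame or a Base is always the left operand. The following is only a sketch under that assumption: the optional data argument and the _unwrap helper are hypothetical additions, not part of the question's code.

```python
import pandas as pd

class Base:
    def __init__(self, data=None):
        # Assumption: allow wrapping an existing DataFrame.
        self.data = pd.DataFrame({'a': [5, 6]}) if data is None else data

    @staticmethod
    def _unwrap(obj):
        # Hypothetical helper: return the wrapped DataFrame for Base objects.
        return obj.data if isinstance(obj, Base) else obj

    def __sub__(self, other):
        return self.data - self._unwrap(other)

base = Base()
df = pd.DataFrame({'a': [1, 2]})

# Option 1: unwrap the right operand explicitly.
print(df - base.data)

# Option 2: wrap the left operand so that Base.__sub__ is the method that runs.
print(Base(df) - base)
```

Both options subtract the two frames element-wise, so each gives -4 in both rows of column a, the reverse of base - df.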

Upvotes: 1
