doom4

Reputation: 695

Override module method to extend functionality then import the updated module from another file

I want to add a custom function to pandas without touching the pandas package. I tried the following:

extra_pandas.py:

from pandas import *
class DataFrame2(pandas.core.frame.DataFrame):
    def new_function(self):
        print("I exist")
pandas.core.frame.DataFrame = DataFrame2

my_script.py:

import extra_pandas as pd
df = pd.read_csv('example.csv')
print(df.new_function())

It appears not to work and I can't figure out what is wrong. I get the following error:

AttributeError: 'DataFrame' object has no attribute 'new_function'

What am I missing?

Thank you very much

Update: I gave the Alternative solution a try and wanted to patch all pandas reader functions in a loop using this snippet:

patch_function = [read_csv, read_json, read_html, read_clipboard, read_excel,
                  read_hdf, read_feather, read_parquet, read_msgpack,
                  read_stata, read_sas, read_pickle, read_sql, read_gbq]

for func in patch_function:
    orig_func = func

    def patch(*args, **kwargs):
        return DataFrame(orig_func(*args, **kwargs))

    func = patch

But this is not working. Any idea why?

Thanks

Upvotes: 1

Views: 193

Answers (1)

Mike Müller

Reputation: 85442

Solution 1

You cannot patch, but you can replace:

extra_pandas.py:

from pandas import *

class DataFrame2(DataFrame):
    def new_function(self):
        print("I exist")

DataFrame = DataFrame2

my_script.py:

import extra_pandas as pd

df = pd.DataFrame(pd.read_csv('furniture.csv'))
df.new_function()  # prints itself; wrapping it in print() would also print None

Output:

I exist

Solution 2

Just import your own class:

extra_pandas.py:

import pandas as pd

class DataFrame2(pd.DataFrame):
    def new_function(self):
        print("I exist")

my_script.py:

import pandas as pd
from extra_pandas import DataFrame2

df = DataFrame2(pd.read_csv('example.csv'))
df.new_function()  # prints itself; wrapping it in print() would also print None

Output:

I exist

The DataFrame constructor accepts an existing dataframe as input, so passing the result of read_csv() to DataFrame2 gives you an instance of the subclass.

Explanation

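You are trying to monkey patch the DataFrame class by rebinding pandas.core.frame.DataFrame. This does not work, because the assignment only changes what that one module attribute refers to. The pandas modules that already imported DataFrame, including the one that implements read_csv(), keep their own reference to the original class and continue to return instances of it.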
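A pandas-free sketch can make the rebinding behaviour easier to see. The modules frame and parsers below are hypothetical stand-ins (not the real pandas modules), built with types.ModuleType so the example runs on its own. It assumes the reader module took its own reference to the class at import time, as a `from ... import DataFrame` statement would.

```python
import types

# Hypothetical stand-in for the module that defines the class.
frame = types.ModuleType("frame")

class DataFrame:
    pass

frame.DataFrame = DataFrame

# Hypothetical stand-in for the module that implements the reader. It holds
# its own reference to the class, as `from frame import DataFrame` would.
parsers = types.ModuleType("parsers")
parsers.DataFrame = frame.DataFrame

def read_csv():
    # Uses the reference stored in parsers, not the one in frame.
    return parsers.DataFrame()

parsers.read_csv = read_csv

class DataFrame2(frame.DataFrame):
    def new_function(self):
        return "I exist"

# Rebinding the attribute on frame does not touch parsers.DataFrame.
frame.DataFrame = DataFrame2

obj = parsers.read_csv()
has_method = hasattr(obj, "new_function")
print(type(obj).__name__, has_method)
```

The reader still returns the original class, so the new method is missing, which matches the AttributeError in the question.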

Alternative

Alternatively, monkey patch read_csv() so that it returns your subclass:

extra_pandas.py:

import pandas as pd

class DataFrame2(pd.DataFrame):
    def new_function(self):
        print("I exist")


orig_read_csv = pd.read_csv

def my_read_csv(*args, **kwargs):
    return DataFrame2(orig_read_csv(*args, **kwargs))

pd.read_csv = my_read_csv

my_script.py:

import pandas as pd

import extra_pandas

df = pd.read_csv('furniture.csv')
df.new_function()  # prints itself; wrapping it in print() would also print None

Output:

I exist
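Regarding the loop in the question's update, two things probably get in the way: `func = patch` only rebinds the loop variable and never touches the module, and every `patch` closure looks up `orig_func` when it is called, by which time it refers to the last function in the list. A minimal sketch of one way around both, using a hypothetical helper patch_readers and a stand-in object fake_pd instead of pandas so that it runs on its own:

```python
import types

def patch_readers(module, names, wrap):
    """Replace module.<name> with a version whose result is passed to wrap."""
    def make_patched(orig_func):
        # The factory gives each wrapper its own orig_func. A closure defined
        # directly in the loop would see only the last value.
        def patched(*args, **kwargs):
            return wrap(orig_func(*args, **kwargs))
        return patched

    for name in names:
        # setattr rebinds the attribute on the module itself. Assigning to
        # the loop variable would only rebind a local name.
        setattr(module, name, make_patched(getattr(module, name)))

# Hypothetical stand-in for the pandas module, so the sketch runs without it.
fake_pd = types.SimpleNamespace(
    read_csv=lambda path: ["csv", path],
    read_json=lambda path: ["json", path],
)

# tuple plays the role of DataFrame2 here.
patch_readers(fake_pd, ["read_csv", "read_json"], tuple)
result_csv = fake_pd.read_csv("a.csv")
result_json = fake_pd.read_json("a.json")
print(result_csv, result_json)
```

With pandas, the equivalent call would presumably be patch_readers(pd, ["read_csv", "read_excel"], DataFrame2), placed in extra_pandas.py. Only readers that return a single dataframe fit this pattern. read_html, for example, returns a list of dataframes and would need separate handling.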

Upvotes: 4
