Reputation: 695
I want to add a custom function to pandas without touching the pandas package. I tried the following:
extra_pandas.py:
from pandas import *
class DataFrame2(pandas.core.frame.DataFrame):
    def new_function(self):
        print("I exist")
pandas.core.frame.DataFrame = DataFrame2
my_script.py:
import extra_pandas as pd
df = pd.read_csv('example.csv')
print(df.new_function())
It appears not to work and I can't figure out what is wrong. I get the following error:
AttributeError: 'DataFrame' object has no attribute 'new_function'
What am I missing?
Thank you very much
Update: I gave the alternative solution a try and wanted to patch all pandas reader functions in a loop using this snippet:
patch_function = [read_csv, read_json, read_html, read_clipboard, read_excel,
                  read_hdf, read_feather, read_parquet, read_msgpack,
                  read_stata, read_sas, read_pickle, read_sql, read_gbq]
for func in patch_function:
    orig_func = func
    def patch(*args, **kwargs):
        return DataFrame(orig_func(*args, **kwargs))
    func = patch
But this is not working. Any idea why?
Thanks
Upvotes: 1
Views: 193
Reputation: 85442
You cannot patch, but you can replace:
extra_pandas.py:
from pandas import *
class DataFrame2(DataFrame):
    def new_function(self):
        print("I exist")
DataFrame = DataFrame2
my_script.py:
import extra_pandas as pd
df = pd.DataFrame(pd.read_csv('furniture.csv'))
df.new_function()
Output:
I exist
Alternatively, just import your own class:
extra_pandas.py:
import pandas as pd
class DataFrame2(pd.DataFrame):
    def new_function(self):
        print("I exist")
my_script.py:
import pandas as pd
from extra_pandas import DataFrame2
df = DataFrame2(pd.read_csv('example.csv'))
df.new_function()
Output:
I exist
DataFrame accepts another dataframe as input and builds a new dataframe from it.
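One caveat with wrapping like this: methods such as head() may hand back a plain DataFrame again, so the custom method can disappear after the first operation. The pandas documentation on subclassing describes overriding the _constructor property for this. A minimal sketch, assuming that mechanism and using a made-up in-memory frame instead of a CSV file:

```python
import pandas as pd

class DataFrame2(pd.DataFrame):
    @property
    def _constructor(self):
        # Ask pandas to build the results of operations as DataFrame2.
        return DataFrame2

    def new_function(self):
        return "I exist"

# Illustrative data only; any existing dataframe can be wrapped the same way.
df = DataFrame2(pd.DataFrame({"a": [1, 2, 3]}))
subset = df.head(2)

print(type(subset).__name__)
print(subset.new_function())
```

Here new_function returns the string instead of printing it, which makes it easier to check.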
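You tried to monkey patch the DataFrame class. This does not work. This is likely because pandas is largely written in Cython and compiled to C extensions, which interferes with your attempt to monkey patch. In addition, assigning to pandas.core.frame.DataFrame only rebinds that one name; functions such as read_csv keep using the DataFrame class that their own modules already imported.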
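One mechanism that can defeat this kind of patch, independent of Cython, is plain Python name binding: a module that has already imported a class by name keeps its own reference to it. A minimal sketch without pandas, where core and reader are hypothetical stand-in modules (not real pandas modules) built on the fly:

```python
import types

# Stand-in for the module that defines the class (think pandas.core.frame).
core = types.ModuleType("core")
exec("class Frame:\n    pass\n", core.__dict__)

# Stand-in for a module that did the equivalent of `from core import Frame`
# at import time and uses that name inside a reader function.
reader = types.ModuleType("reader")
reader.Frame = core.Frame
exec("def read():\n    return Frame()\n", reader.__dict__)

class Frame2(core.Frame):
    def new_function(self):
        return "I exist"

# Rebinding the name in `core` does not touch the reference held by `reader`.
core.Frame = Frame2

obj = reader.read()
print(type(obj).__name__)            # Frame
print(hasattr(obj, "new_function"))  # False
```

The reader still produces the original class, which matches the AttributeError in the question.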
Or monkey patch read_csv():
extra_pandas.py:
import pandas as pd
class DataFrame2(pd.DataFrame):
    def new_function(self):
        print("I exist")
orig_read_csv = pd.read_csv
def my_read_csv(*args, **kwargs):
    return DataFrame2(orig_read_csv(*args, **kwargs))
pd.read_csv = my_read_csv
my_script.py:
import pandas as pd
import extra_pandas
df = pd.read_csv('furniture.csv')
df.new_function()
Output:
I exist
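The loop from the update in the question fails for two reasons: func = patch only rebinds the local loop variable and never changes the attribute on the pandas module, and every patch closure would look up orig_func when called, so all of them would see the last function in the list. A minimal sketch of one way around both, assuming a helper named make_patched (hypothetical) and only a small subset of readers:

```python
import io
import pandas as pd

class DataFrame2(pd.DataFrame):
    def new_function(self):
        return "I exist"

def make_patched(orig_func):
    # The factory gives each wrapper its own orig_func; a closure defined
    # directly in the loop body would share one variable across iterations.
    def patched(*args, **kwargs):
        return DataFrame2(orig_func(*args, **kwargs))
    return patched

# Illustrative subset; add further reader names as needed.
for name in ["read_csv", "read_json"]:
    # setattr rebinds the attribute on the pandas module itself,
    # whereas `func = patch` only rebinds a local name.
    setattr(pd, name, make_patched(getattr(pd, name)))

# In-memory CSV used here instead of a file on disk.
df = pd.read_csv(io.StringIO("a,b\n1,2\n"))
print(df.new_function())
```

Readers that do not return a single dataframe, such as read_html, which returns a list of dataframes, would need a different wrapper.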
Upvotes: 4