Reputation: 3771
I am writing a custom accessor class for a pandas DataFrame. I have followed the examples here and they work as expected. However, I have a function to which I would like to pass additional arguments.
I have created this function within my accessor class:
@property
def accessor_function(self, time_window=0.5):
    def group_function(df, time):
        fl = df.loc[df.Type_num==0]
        id = fl.Time.idxmin()
        threshold = df.loc[id, 'column'] + time
        return fl.loc[fl.Time<threshold]
    self.Subset = self._obj.groupby(by=['col_1','col_2']).apply(group_function, time_window)
    self.Subset.reset_index(drop=True, inplace=True)
    return self.Subset
If I call it like this, it works, using the default time_window=0.5:
df.accessor.accessor_function
However if I want to pass a different value for the keyword argument:
df.accessor.accessor_function(time_window = 1)
I get an error:
TypeError: 'DataFrame' object is not callable
I can't find any documentation on passing args or kwargs to custom accessors, so I'm not sure whether what I'm attempting is even possible. It would be good to understand how to move forward.
Ben
Upvotes: 3
Views: 831
Reputation: 841
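I believe the problem is that you are using the property decorator on what is actually a method. With @property, df.accessor.accessor_function already evaluates the function and returns a DataFrame, so adding (time_window = 1) tries to call that DataFrame, hence the TypeError. If you remove the decorator, it should work fine; see the example below: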
import pandas as pd

@pd.api.extensions.register_dataframe_accessor("accessor")
class MyAccessor:
    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    # A plain method (no @property), so it can accept arguments
    def accessor_function(self, time_window=0.5):
        def group_function(df, time):
            # Keep rows with Type_num == 0 whose Time is below the group's threshold
            fl = df.loc[df.Type_num==0]
            id = fl.Time.idxmin()
            threshold = df.loc[id, 'column'] + time
            return fl.loc[fl.Time<threshold]
        self.Subset = self._obj.groupby(['col_1','col_2']).apply(group_function, time_window)
        self.Subset.reset_index(drop=True, inplace=True)
        return self.Subset
The default case is:
>>> a = pd.DataFrame({'Type_num': [False, False, False, False, False],
...                   'Time': [1, 2, 0.1, 0.2, 0.5],
...                   'col_1': ['A', 'B', 'C', 'D', 'E'],
...                   'col_2': ['A', 'A', 'C', 'E', 'E'],
...                   'column': [0.2, 0.2, 0.2, 0.2, 0.2]})
>>> a.accessor.accessor_function()
   Type_num  Time col_1 col_2  column
0     False   0.1     C     C     0.2
1     False   0.2     D     E     0.2
2     False   0.5     E     E     0.2
You can use a custom time_window:
>>> a.accessor.accessor_function(time_window=1)
   Type_num  Time col_1 col_2  column
0     False   1.0     A     A     0.2
1     False   0.1     C     C     0.2
2     False   0.2     D     E     0.2
3     False   0.5     E     E     0.2
Or pass that parameter using *args or **kwargs unpacking:
>>> a.accessor.accessor_function(*[2])
   Type_num  Time col_1 col_2  column
0     False   1.0     A     A     0.2
1     False   2.0     B     A     0.2
2     False   0.1     C     C     0.2
3     False   0.2     D     E     0.2
4     False   0.5     E     E     0.2
>>> a.accessor.accessor_function(**{'time_window':0.1})
   Type_num  Time col_1 col_2  column
0     False   0.1     C     C     0.2
1     False   0.2     D     E     0.2
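To see why the original version failed, here is a minimal sketch that reproduces the same behaviour without pandas. The class Demo and its attribute names are hypothetical stand-ins for the accessor, and a list stands in for the returned DataFrame, so the error message mentions 'list' rather than 'DataFrame'.

```python
class Demo:
    """Hypothetical stand-in for an accessor class; not part of pandas."""

    @property
    def as_property(self, time_window=0.5):
        # Runs on attribute access, so time_window is always the default
        return [time_window]

    def as_method(self, time_window=0.5):
        # A normal method: runs only when called, so arguments can be passed
        return [time_window]


d = Demo()

# No parentheses: the property has already been evaluated with the default
default_result = d.as_property

# With parentheses: Python first gets the list, then tries to call it
try:
    d.as_property(time_window=1)
except TypeError as exc:
    error_message = str(exc)

# The plain method accepts the keyword argument as expected
method_result = d.as_method(time_window=1)

print(default_result, error_message, method_result)
```

A property can never receive arguments from the caller, which is why the keyword default in the original accessor_function was the only value it could ever use.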
Upvotes: 5