Kyle
Kyle

Reputation: 2545

Apply function with multiple arguments to rolling DataFrame Pandas

I have a Pandas DataFrame similar to the below. Please not this does not need to be in this format, it can be put back into pivoted fashion and have each variable be a column if that's easier.

      date      variable  value
0   2014-01-31    item1  0.070898
1   2014-02-28    item1  0.064725
2   2014-03-31    item1  0.101292
3   2014-04-30    item1  0.041898
4   2014-05-31    item1  0.094894
5   2014-06-30    item1  0.110287
6   2014-07-31    item1  0.081844
7   2014-08-31    item1  0.043819
8   2014-09-30    item1  0.009196
9   2014-10-31    item1  0.076482
10  2014-11-30    item1  0.055906
11  2014-12-31    item1  0.042028
12  2015-01-31    item1  0.115469
13  2015-02-28    item1  0.130457
14  2015-03-31    item1  0.054314
15  2015-04-30    item1  0.141286
16  2015-05-31    item1  0.109387
17  2015-06-30    item1  0.066154
18  2015-07-31    item1  0.039390
19  2015-08-31    item1  0.071276
20  2015-09-30    item1  0.088784
21  2015-10-31    item1  0.054234
22  2015-11-30    item1  0.111854
23  2015-12-31    item1  0.005053
24  2016-01-31    item1  0.145953
25  2016-02-29    item1  0.138051
26  2016-03-31    item1  0.030395
27  2016-04-30    item1  0.055843
28  2016-05-31    item1  0.037960
29  2016-06-30    item1  0.147318
..         ...      ...       ...
120 2015-09-30      ref  0.043185
121 2015-10-31      ref  0.046849
122 2015-11-30      ref  0.008139
123 2015-12-31      ref  0.011222
124 2016-01-31      ref  0.026408
125 2016-02-29      ref  0.040404
126 2016-03-31      ref  0.039585
127 2016-04-30      ref  0.028782
128 2016-05-31      ref  0.002799
129 2016-06-30      ref  0.040413
130 2016-07-31      ref  0.004451
131 2016-08-31      ref  0.001946
132 2016-09-30      ref  0.029548
133 2016-10-31      ref  0.029416
134 2016-11-30      ref  0.024893
135 2016-12-31      ref  0.037656
136 2017-01-31      ref  0.045642
137 2017-02-28      ref  0.046690
138 2017-03-31      ref  0.019193
139 2017-04-30      ref  0.016362
140 2017-05-31      ref  0.025512
141 2017-06-30      ref  0.021625
142 2017-07-31      ref  0.026472
143 2017-08-31      ref  0.016531
144 2017-09-30      ref  0.047447
145 2017-10-31      ref  0.032180
146 2017-11-30      ref  0.048635
147 2017-12-31      ref  0.040450
148 2018-01-31      ref  0.047735
149 2018-02-28      ref  0.013727          

For each item in variable I would like to apply a rolling 12 month calculation that accepts as arguments the rolling 12 months of value for that item, and the rolling 12 months of values where variable equals ref. The function is of the form:

def func(series1,series2):
   do something with series
   return single_value_from_doiong_something

My issue is how to pass two different series into this function to return my single value. Does anyone know how to do this?

Upvotes: 0

Views: 96

Answers (1)

rahlf23
rahlf23

Reputation: 9019

Regarding the function you mention, after you restructure your original dataframe with a join(), this may be closer to what you are after:

combined = df[df['variable']!='ref'].set_index('date').join(df[df['variable']=='ref'].set_index('date'), lsuffix='', rsuffix='_ref').drop('variable_ref', axis=1)

def func(series, ref_series):

    #As an example
    return series.mean()/ref_series.mean()

combined.groupby('variable').rolling(12).apply(lambda x: func(x, combined.loc[x.index]['value_ref']), raw=False).drop('value_ref', axis=1)

This example yields the following (NaN are present due to the gaps in your sample data):

                        value
variable date                
item1    2014-01-31       NaN
         2014-02-28       NaN
         2014-03-31       NaN
         2014-04-30       NaN
         2014-05-31       NaN
         2014-06-30       NaN
         2014-07-31       NaN
         2014-08-31       NaN
         2014-09-30       NaN
         2014-10-31       NaN
         2014-11-30       NaN
         2014-12-31       NaN
         2015-01-31       NaN
         2015-02-28       NaN
         2015-03-31       NaN
         2015-04-30       NaN
         2015-05-31       NaN
         2015-06-30       NaN
         2015-07-31       NaN
         2015-08-31       NaN
         2015-09-30  1.912186
         2015-10-31  1.793184
         2015-11-30  2.609254
         2015-12-31  3.009455
         2016-01-31  3.123833
         2016-02-29  2.910599
         2016-03-31  2.708132
         2016-04-30  2.497878
         2016-05-31  2.561760
         2016-06-30  2.681712

Upvotes: 1

Related Questions