denutza
denutza

Reputation: 15

Pandas calculate and apply weighted rolling average on another column

I am having a hard time figuring out how to get "rolling weights" based off of one of my columns, then factor these weights onto another column.

I've tried groupby.rolling.apply (function) on my data but the main problem is just conceptualizing how I'm going to take a running/rolling average of the column I'm going to turn into weights, and then factor this "window" of weights onto another column that isn't rolled.

I'm also purposely setting min_period to 1, so you'll notice my first two rows in each group final output "rwag" mirror the original.

W is the rolling column to derive the weights from. B is the column to apply the rolled weights to. Grouping is only done on column a.

df is already sorted by a and yr.

def wavg(w,x):
    return (x * w).sum() / w.sum()

n=df.groupby(['a1'])[['w']].rolling(window=3,min_periods=1).apply(lambda x:  wavg(df['w'],df['b']))


Input:

id | yr  |   a  |   b    |   w
---------------------------------
0  | 1990 |  a1 |   50   |  3000   
1  | 1991 |  a1 |   40   |  2000   
2  | 1992 |  a1 |   10   |  1000   
3  | 1993 |  a1 |   20   |  8000         
4  | 1990 |  b1 |   10   |  500    
5  | 1991 |  b1 |   20   |  1000   
6  | 1992 |  b1 |   30   |  500    
7  | 1993 |  b1 |   40   |  4000        


Desired output:

id | yr  |   a  |   b  |   rwavg
---------------------------------
 0   1990   a1    50      50
 1   1991   a1    40      40
 2   1992   a1    10      39.96
 3   1993   a1    20      22.72
 4   1990   b1    10      10 
 5   1991   b1    20      20
 6   1992   b1    30      20
 7   1993   b1    40      35.45

Upvotes: 0

Views: 1667

Answers (1)

BENY
BENY

Reputation: 323226

apply with rolling usually have some wired behavior

df['Weight']=df.b*df.w
g=df.groupby(['a']).rolling(window=3,min_periods=1)
g['Weight'].sum()/g['w'].sum()
df['rwavg']=(g['Weight'].sum()/g['w'].sum()).values

Out[277]: 
a    
a1  0    50.000000
    1    46.000000
    2    40.000000
    3    22.727273
b1  4    10.000000
    5    16.666667
    6    20.000000
    7    35.454545
dtype: float64

Upvotes: 1

Related Questions