Shift rolling moving average on two columns grouped Dataframe

Question

I want to groupby two column with the same value in 'A' and 'B' and make a shift().rolling() based on which column is located that value.

df

   s  A  B  ValueA  ValueB
0  1  b  a       1       3
1  1  c  a       2       2
2  0  a  b       2       4
3  0  c  b       1       3
4  0  a  c       2       1
5  0  c  a       4       2
6  0  b  a       7       4
7  0  c  b       1       5

Now, if the value is in column 'A' consider ValueA to shift().rolling() if the value is in column 'B' consider ValueB to shift().rolling()

I tried this code but the result is not correct

Code

from numpy.core.numeric import NaN

df = pd.DataFrame({
    's' : [1,1,0,0,0,0,0,0],
    'A' : ['b','c','a','c','a','c','b','c'],
    'B': ['a', 'a', 'b', 'b','c','a','a','b'],
    'ValueA':[1,2,2,1,2,4,7,1],
    'ValueB':[3,2,4,3,1,2,4,5]
})
print(df)

df[['sumA','sumB']] = df.groupby(['A','B'])[['ValueA','ValueB']].shift().rolling(2, min_periods=2).sum()
print(df)

Expected Output


  ValueA    ValueB
0   NaN     NaN
1   NaN     NaN
2   NaN     NaN
3   NaN     NaN
4   NaN     NaN
5   2.0     4.0
6   7.0     4.0
7   5.0     10.0

woblob · Accepted Answer

I tried non loop version, but shift doesn't keep group devisions.

(Based on this answer)

output_df = pd.DataFrame()
for _, grp in df[['ValueA','ValueB']].stack().groupby(df[['A','B']].stack().tolist()):
    sub_df = grp.shift().rolling(2, min_periods=2).sum().unstack()
    output_df = output_df.combine_first(sub_df)
output_df

Outputs:

|    |   ValueA |   ValueB |
|---:|---------:|---------:|
|  0 |      nan |      nan |
|  1 |      nan |      nan |
|  2 |        5 |      nan |
|  3 |      nan |        5 |
|  4 |        4 |        3 |
|  5 |        2 |        4 |
|  6 |        7 |        4 |
|  7 |        5 |       10 |

Shift rolling moving average on two columns grouped Dataframe

Answers (1)

Related Questions