Reputation: 529
I want to groupby
two column with the same value in 'A' and 'B' and make a shift().rolling()
based on which column is located that value.
df
s A B ValueA ValueB
0 1 b a 1 3
1 1 c a 2 2
2 0 a b 2 4
3 0 c b 1 3
4 0 a c 2 1
5 0 c a 4 2
6 0 b a 7 4
7 0 c b 1 5
Now, if the value is in column 'A' consider ValueA to shift().rolling()
if the value is in column 'B' consider ValueB to shift().rolling()
I tried this code but the result is not correct
Code
from numpy.core.numeric import NaN
df = pd.DataFrame({
's' : [1,1,0,0,0,0,0,0],
'A' : ['b','c','a','c','a','c','b','c'],
'B': ['a', 'a', 'b', 'b','c','a','a','b'],
'ValueA':[1,2,2,1,2,4,7,1],
'ValueB':[3,2,4,3,1,2,4,5]
})
print(df)
df[['sumA','sumB']] = df.groupby(['A','B'])[['ValueA','ValueB']].shift().rolling(2, min_periods=2).sum()
print(df)
Expected Output
ValueA ValueB
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
4 NaN NaN
5 2.0 4.0
6 7.0 4.0
7 5.0 10.0
Upvotes: 0
Views: 139
Reputation: 1377
I tried non loop version, but shift
doesn't keep group devisions.
(Based on this answer)
output_df = pd.DataFrame()
for _, grp in df[['ValueA','ValueB']].stack().groupby(df[['A','B']].stack().tolist()):
sub_df = grp.shift().rolling(2, min_periods=2).sum().unstack()
output_df = output_df.combine_first(sub_df)
output_df
Outputs:
| | ValueA | ValueB |
|---:|---------:|---------:|
| 0 | nan | nan |
| 1 | nan | nan |
| 2 | 5 | nan |
| 3 | nan | 5 |
| 4 | 4 | 3 |
| 5 | 2 | 4 |
| 6 | 7 | 4 |
| 7 | 5 | 10 |
Upvotes: 2