Reputation: 77
I tried using rolling(4).sum().shift(-3)
but I run into an issue where it keeps adding because I don't have a condition where it stops to check if the movement has changed. I tried groupby but it throws an error as well. Any advice?
movement | value |
---|---|
right | 2 |
right | 1 |
right | 3 |
right | 1 |
right | 1 |
right | 1 |
right | 1 |
right | 1 |
Left | 5 |
Left | 4 |
Left | 2 |
Left | 1 |
Left | 1 |
Left | 1 |
Left | 1 |
Left | 1 |
What I want to get is the following:
movement | value | rolling value |
---|---|---|
right | 2 | 7 |
right | 1 | 6 |
right | 3 | 6 |
right | 1 | 4 |
right | 1 | 4 |
right | 1 | nan |
right | 1 | nan |
right | 1 | nan |
Left | 5 | 12 |
Left | 4 | 8 |
Left | 2 | 5 |
Left | 1 | 4 |
Left | 1 | 4 |
Left | 1 | nan |
Left | 1 | nan |
Left | 1 | nan |
Upvotes: 1
Views: 405
Reputation: 35636
We can use a FixedForwardWindowIndexer
with an offset of -3 as the window
instead of shifting after the fact, and droplevel
to remove the additional index in movement, but keep the index alignment of the DataFrame:
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=4, offset=-3)
df['rolling value'] = (
df.groupby('movement')['value'].rolling(window=indexer).sum().droplevel(0)
)
df
:
movement value rolling value
0 right 2 7.0
1 right 1 6.0
2 right 3 6.0
3 right 1 4.0
4 right 1 4.0
5 right 1 NaN
6 right 1 NaN
7 right 1 NaN
8 Left 5 12.0
9 Left 4 8.0
10 Left 2 5.0
11 Left 1 4.0
12 Left 1 4.0
13 Left 1 NaN
14 Left 1 NaN
15 Left 1 NaN
Just the produced Series:
indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=4, offset=-3)
print(df.groupby('movement')['value'].rolling(window=indexer).sum())
movement
Left 8 12.0
9 8.0
10 5.0
11 4.0
12 4.0
13 NaN
14 NaN
15 NaN
right 0 7.0
1 6.0
2 6.0
3 4.0
4 4.0
5 NaN
6 NaN
7 NaN
Name: value, dtype: float64
The first level (movement
) is the issue in assigning the values back to the DataFrame (and the reason why groupby
wouldn't work).
droplevel(0)
makes the Series :
8 12.0
9 8.0
10 5.0
11 4.0
12 4.0
13 NaN
14 NaN
15 NaN
0 7.0
1 6.0
2 6.0
3 4.0
4 4.0
5 NaN
6 NaN
7 NaN
Name: value, dtype: float64
This will align correctly with the DataFrame.
Slightly different DataFrame used to show the sums:
import pandas as pd
df = pd.DataFrame({
'movement': ['right', 'right', 'right', 'right', 'right', 'right', 'right',
'right', 'Left', 'Left', 'Left', 'Left', 'Left', 'Left',
'Left', 'Left'],
'value': [2, 1, 3, 1, 1, 1, 1, 1, 5, 4, 2, 1, 1, 1, 1, 1]
})
Upvotes: 1