joshuah9
joshuah9

Reputation: 77

Issues using the rolling feature of pandas with a condition

I tried using rolling(4).sum().shift(-3) but I run into an issue where it keeps adding because I don't have a condition where it stops to check if the movement has changed. I tried groupby but it throws an error as well. Any advice?

movement value
right 2
right 1
right 3
right 1
right 1
right 1
right 1
right 1
Left 5
Left 4
Left 2
Left 1
Left 1
Left 1
Left 1
Left 1

What I want to get is the following:

movement value rolling value
right 2 7
right 1 6
right 3 6
right 1 4
right 1 4
right 1 nan
right 1 nan
right 1 nan
Left 5 12
Left 4 8
Left 2 5
Left 1 4
Left 1 4
Left 1 nan
Left 1 nan
Left 1 nan

Upvotes: 1

Views: 405

Answers (1)

Henry Ecker
Henry Ecker

Reputation: 35636

We can use a FixedForwardWindowIndexer with an offset of -3 as the window instead of shifting after the fact, and droplevel to remove the additional index in movement, but keep the index alignment of the DataFrame:

indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=4, offset=-3)
df['rolling value'] = (
    df.groupby('movement')['value'].rolling(window=indexer).sum().droplevel(0)
)

df:

   movement  value  rolling value
0     right      2            7.0
1     right      1            6.0
2     right      3            6.0
3     right      1            4.0
4     right      1            4.0
5     right      1            NaN
6     right      1            NaN
7     right      1            NaN
8      Left      5           12.0
9      Left      4            8.0
10     Left      2            5.0
11     Left      1            4.0
12     Left      1            4.0
13     Left      1            NaN
14     Left      1            NaN
15     Left      1            NaN

Just the produced Series:

indexer = pd.api.indexers.FixedForwardWindowIndexer(window_size=4, offset=-3)
print(df.groupby('movement')['value'].rolling(window=indexer).sum())
movement    
Left      8     12.0
          9      8.0
          10     5.0
          11     4.0
          12     4.0
          13     NaN
          14     NaN
          15     NaN
right     0      7.0
          1      6.0
          2      6.0
          3      4.0
          4      4.0
          5      NaN
          6      NaN
          7      NaN
Name: value, dtype: float64

The first level (movement) is the issue in assigning the values back to the DataFrame (and the reason why groupby wouldn't work).

droplevel(0) makes the Series :

8     12.0
9      8.0
10     5.0
11     4.0
12     4.0
13     NaN
14     NaN
15     NaN
0      7.0
1      6.0
2      6.0
3      4.0
4      4.0
5      NaN
6      NaN
7      NaN
Name: value, dtype: float64

This will align correctly with the DataFrame.


Slightly different DataFrame used to show the sums:

import pandas as pd

df = pd.DataFrame({
    'movement': ['right', 'right', 'right', 'right', 'right', 'right', 'right',
                 'right', 'Left', 'Left', 'Left', 'Left', 'Left', 'Left',
                 'Left', 'Left'],
    'value': [2, 1, 3, 1, 1, 1, 1, 1, 5, 4, 2, 1, 1, 1, 1, 1]
})

Upvotes: 1

Related Questions