William Bernard
William Bernard

Reputation: 357

Pandas Rolling Groupby Shift back 1, Trying to lag rolling sum

I am trying to get a rolling sum of the past 3 rows for the same ID but lagging this by 1 row. My attempt looked like the below code and i is the column. There has to be a way to do this but this method doesnt seem to work.

for i in df.columns.values:
    df.groupby('Id', group_keys=False)[i].rolling(window=3, min_periods=2).mean().shift(1)


id    dollars  lag

1      6       nan
1      7       nan
1      6       6.5
3      7       nan
3      4       nan
3      4       5.5
3      3       5
5      6       nan
5      5       nan
5      6       5.5
5      12      5.67
5      7       8.3

Upvotes: 3

Views: 3781

Answers (1)

C8H10N4O2
C8H10N4O2

Reputation: 19005

I am trying to get a rolling sum of the past 3 rows for the same ID but lagging this by 1 row.

You can create the lagged rolling sum by chaining DataFrame.groupby(ID), .shift(1) for the lag 1, .rolling(3) for the window 3, and .sum() for the sum.

Example: Let's say your dataset is:

import pandas as pd
# Reproducible datasets are your friend!
d = pd.DataFrame({'grp':pd.Series(['A']*4 + ['B']*5 + ['C']*6),
                  'x':pd.Series(range(15))})
print(d)
 grp   x
   A   0
   A   1
   A   2
   A   3
   B   4
   B   5
   B   6
   B   7
   B   8
   C   9
   C  10
   C  11
   C  12
   C  13
   C  14

I think what you're asking for is this:

d['y'] = d.groupby('grp')['x'].shift(1).rolling(3).sum()
print(d)
 grp   x     y
   A   0   NaN
   A   1   NaN
   A   2   NaN
   A   3   3.0
   B   4   NaN
   B   5   NaN
   B   6   NaN
   B   7  15.0
   B   8  18.0
   C   9   NaN
   C  10   NaN
   C  11   NaN
   C  12  30.0
   C  13  33.0
   C  14  36.0

Upvotes: 3

Related Questions