Strictly Funk
Strictly Funk

Reputation: 358

How do i calculate a rolling sum by group with monthly data in Python?

I am trying to use rolling().sum() to create a dataframe with 2-month rolling sums within each 'type'. Here's what my data looks like:

import pandas as pd
df = pd.DataFrame({'type': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C'],
                   'date': ['2022-01-01', '2022-02-01', '2022-03-01', '2022-04-01',
                            '2022-01-01', '2022-02-01', '2022-03-01', '2022-04-01',
                            '2022-01-01', '2022-02-01', '2022-03-01', '2022-04-01'],
                   'value': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]})

and here is the expected result:

expected result

and here is what I have tried (unsuccessfully):

rolling_sum = df.groupby(['date', 'type']).rolling(2).sum().reset_index()

Upvotes: 1

Views: 1405

Answers (3)

thedatastrategist
thedatastrategist

Reputation: 445

You're pretty close -- just need to specify the level and set drop=True on reset_index(). Also, you can remove date from the groupby, since it's just grouped on type. You should use sort_values to ensure the input dataframe is in the correct order for the rolling sum.

df = df.sort_values(by=['type', 'date'])
df['rolling_sum'] = df.groupby(['type']).rolling(2).sum().reset_index(level=0, drop=True)

Upvotes: 0

DataFramed
DataFramed

Reputation: 1631

Answer

The below code does the work.

rolling_sum = df.groupby(['type']).rolling(2).sum()

Upvotes: 0

constantstranger
constantstranger

Reputation: 9379

Here's a way to do it:

rolling_sum = (
    df.assign(value=df.groupby(['type'])['value']
    .rolling(2, min_periods=1).sum().reset_index()['value'])
)

Output:

   type        date  value
0     A  2022-01-01    1.0
1     A  2022-02-01    3.0
2     A  2022-03-01    5.0
3     A  2022-04-01    7.0
4     B  2022-01-01    5.0
5     B  2022-02-01   11.0
6     B  2022-03-01   13.0
7     B  2022-04-01   15.0
8     C  2022-01-01    9.0
9     C  2022-02-01   19.0
10    C  2022-03-01   21.0
11    C  2022-04-01   23.0

Explanation:

  • Use groupby() only on type (without date) so that all dates are in the group for a given type
  • the min_periods argument ensures rolling works even for the first row where 2 periods are not available
  • Use assign() to update the value column using index alignment.

Upvotes: 3

Related Questions