Gerry
Gerry

Reputation: 626

Groupby and rolling by monthly mean and stdev

data = {‘A’:list(range(250)), ‘B’: list(range(251,501))}
dates = pd.date_range(‘2023-01-17’, period=250, freq=‘D’)
df = pd.DataFrame(data, index=dates)

I want rolling 2 calendar month mean as follows:

                 A      B
2023-02-28    21.0  272.0
2023-03-31    44.0  295.0
2023-04-30    73.0  324.0
…
2023-08-31   222.5  473.5

And rolling 2 calendar months stdev

                  A        B
2023-02-28   12.556   12.556
2023-03-31   17.175   17.175
2023-04-30   17.175   17.175
…
2023-08-31   15.732   15.732

Note: Any column can have NaN

Upvotes: 0

Views: 53

Answers (2)

mozway
mozway

Reputation: 262214

What you want is not really a rolling operation this is a mix of rolling and aggregation, which unfortunately is not straightforward with non associative computations like mean/std.

What you could do is to resample as many times as you have windows, shifting the start of computations for each step:

N = 2

# min period
p = df.index.min().to_period('M')

out = pd.concat([df.reindex(df.index.union([(p-n).start_time]))
                   .resample(f'{N}ME', origin='end_day').agg(['mean', 'std'])
                 for n in range(N)
                ]).sort_index().iloc[N-1:]

Output:

                A                 B           
             mean        std   mean        std
2023-01-31    7.0   4.472136  258.0   4.472136
2023-02-28   21.0  12.556539  272.0  12.556539
2023-03-31   44.0  17.175564  295.0  17.175564
2023-04-30   73.0  17.752934  324.0  17.752934
2023-05-31  104.0  17.752934  355.0  17.752934
2023-06-30  134.0  17.752934  385.0  17.752934
2023-07-31  165.0  17.752934  416.0  17.752934
2023-08-31  195.5  18.041619  446.5  18.041619
2023-09-30  222.5  15.732133  473.5  15.732133
2023-10-31  238.0   6.782330  489.0   6.782330

Upvotes: 0

Kelo
Kelo

Reputation: 169

To achieve what you want, I think you just need to resample the data monthly and apply a rolling window of 2 to calculate the mean and standard deviation. Try with the following code:

rolling_mean = df.resample('M').mean().rolling(window=2).mean()
rolling_std = df.resample('M').mean().rolling(window=2).std()

Upvotes: 0

Related Questions