Reputation: 93
I would like to take rolling stats over a fixed period say 5 day, eg
DATE Price
ID
AAPL US Equity 2015-01-02 109.33
AAPL US Equity 2015-01-05 106.25
AAPL US Equity 2015-01-06 106.26
AAPL US Equity 2015-01-07 107.75
AAPL US Equity 2015-01-08 111.89
AAPL US Equity 2015-01-09 112.01
AAPL US Equity 2015-01-12 109.25
AAPL US Equity 2015-01-13 110.22
AAPL US Equity 2015-01-14 109.80
AAPL US Equity 2015-01-15 106.82
doing
df['Average']=df['Price'].rolling(5).mean()
to give
DATE Price Average
ID
AAPL US Equity 2015-01-02 109.33 NaN
AAPL US Equity 2015-01-05 106.25 NaN
AAPL US Equity 2015-01-06 106.26 NaN
AAPL US Equity 2015-01-07 107.75 NaN
AAPL US Equity 2015-01-08 111.89 108.296
AAPL US Equity 2015-01-09 112.01 108.832
AAPL US Equity 2015-01-12 109.25 109.432
AAPL US Equity 2015-01-13 110.22 110.224
AAPL US Equity 2015-01-14 109.80 110.634
AAPL US Equity 2015-01-15 106.82 109.620
how can i modify to apply any function to get any rolling stats over the fixed period but for the first few rows over the available data ie for first row average will be first day price, for first two rows average will be first two day price etc
I know I can use iterrows to achieve this in this one of 'average' value case, but ideally I would like to use it for any stat like quantile,std etc
Something like
df['Average']=my_rolling_stat(df['Price'],period=5,function='mean')
df['Stdev']=my_rolling_stat(df['Price'],period=10,function='std')
df['95_Perc']=my_rolling_stat(df['Price'],period=10,function='quantile',quantile_value=0.95)
Thanks in advance
Upvotes: 1
Views: 94
Reputation: 93141
To add to Scott Boston's answer, you can define your rolling stat function as:
def my_rolling_stat(series, period, function, **kwargs):
window = series.rolling(period, min_periods=1)
func = getattr(window, function)
return func(**kwargs)
Usage:
my_rolling_stat(df['Price'], period=5, function='mean')
my_rolling_stat(df['Price'], period=10, function='std')
my_rolling_stat(df['Price'], period=10, function='quantile', quantile=0.95)
The list of functions and their arguments can be found from the Computation / Descriptive stats list.
Upvotes: 2
Reputation: 153460
IIUC, use min_periods
parameter in rolling
:
df['Average']=df['Price'].rolling(5, min_periods=1).mean()
Output:
0 109.3300
1 107.7900
2 107.2800
3 107.3975
4 108.2960
5 108.8320
6 109.4320
7 110.2240
8 110.6340
9 109.6200
Name: PRICE, dtype: float64
Upvotes: 4