Reputation: 1987
Following is my dataframe. I am trying to calculate a rolling 5-period percent rank of ATR. RollingPercentRank is my desired output.
symbol Day time ATR RollingPercentRank
316356 SPY 11/29/2018 10:35:00 0.377880 NaN
316357 SPY 11/29/2018 10:40:00 0.391092 NaN
316358 SPY 11/29/2018 10:45:00 0.392983 NaN
316359 SPY 11/29/2018 10:50:00 0.399685 NaN
316360 SPY 11/29/2018 10:55:00 0.392716 0.2
316361 SPY 11/29/2018 11:00:00 0.381445 0.2
316362 AAPL 11/29/2018 11:05:00 0.387300 NaN
316363 AAPL 11/29/2018 11:10:00 0.390570 NaN
316364 AAPL 11/29/2018 11:15:00 0.381313 NaN
316365 AAPL 11/29/2018 11:20:00 0.398182 NaN
316366 AAPL 11/29/2018 11:25:00 0.377364 0.6
316367 AAPL 11/29/2018 11:30:00 0.373627 0.2
As of the 5th row, I want to apply the percent rank function to the 5 most recent values (1st row to 5th row) of ATR within a group. As of the 6th row, I want to apply the rank function again to the 5 most recent values (2nd row to 6th row) of ATR.
I have tried the following, which gives a "'numpy.ndarray' object has no attribute 'rank'" error:
df['RollingPercentRank'] = df.groupby(['symbol'])['ATR'].rolling(window=5,min_periods=5,center=False).apply(lambda x: x.rank(pct=True)).reset_index(drop=True)
Upvotes: 2
Views: 3771
Reputation: 29635
IIUC (I don't get the expected output you showed): to use rank, you need a pd.Series, and since you only want the last value of that 5-element percentage Series, it would be:
print (df.groupby(['symbol'])['ATR']
.rolling(window=5,min_periods=5,center=False)
.apply(lambda x: pd.Series(x).rank(pct=True).iloc[-1]))
symbol i
AAPL 316362 NaN
316363 NaN
316364 NaN
316365 NaN
316366 0.2
316367 0.2
SPY 316356 NaN
316357 NaN
316358 NaN
316359 NaN
316360 0.6
316361 0.2
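For anyone who wants to reproduce this, here is a minimal self-contained sketch. The frame is rebuilt by hand from the question's sample (only symbol and ATR), and the helper name last_pct_rank is made up for illustration. It passes raw=True explicitly so the function receives a NumPy array, which assumes a pandas version where rolling().apply accepts the raw argument (0.23 or later).

```python
import pandas as pd

# Sample data copied from the question (symbol and ATR columns only).
df = pd.DataFrame(
    {
        "symbol": ["SPY"] * 6 + ["AAPL"] * 6,
        "ATR": [0.377880, 0.391092, 0.392983, 0.399685, 0.392716, 0.381445,
                0.387300, 0.390570, 0.381313, 0.398182, 0.377364, 0.373627],
    },
    index=range(316356, 316368),
)


def last_pct_rank(window):
    # Hypothetical helper: percent rank of the newest value in its window.
    # rolling().apply needs a single number back, hence the .iloc[-1].
    return pd.Series(window).rank(pct=True).iloc[-1]


ranks = (
    df.groupby("symbol")["ATR"]
    .rolling(window=5, min_periods=5)
    .apply(last_pct_rank, raw=True)
)

# Drop the 'symbol' index level so the result aligns with df's own index.
df["RollingPercentRank"] = ranks.reset_index(level=0, drop=True)
print(df)
```

The key point is that the function given to apply has to reduce each window to one scalar; returning the whole ranked Series, as in the question's attempt, would not work even if x had a rank method.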
Because x is a numpy array, it is possible to get the same result by applying argsort twice. To create the column, add a reset_index at the end:
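win_val = 5
# raw=True passes each window to the lambda as a NumPy array;
# newer pandas versions pass a Series by default
df['RollingPercentRank'] = (df.groupby(['symbol'])['ATR']
    .rolling(window=win_val,min_periods=win_val,center=False)
    .apply(lambda x: x.argsort().argsort()[-1]+1, raw=True)
    .reset_index(level=0,drop=True)/win_val)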
print (df)
symbol Day time ATR RollingPercentRank
316356 SPY 11/29/2018 10:35:00 0.377880 NaN
316357 SPY 11/29/2018 10:40:00 0.391092 NaN
316358 SPY 11/29/2018 10:45:00 0.392983 NaN
316359 SPY 11/29/2018 10:50:00 0.399685 NaN
316360 SPY 11/29/2018 10:55:00 0.392716 0.6
316361 SPY 11/29/2018 11:00:00 0.381445 0.2
316362 AAPL 11/29/2018 11:05:00 0.387300 NaN
316363 AAPL 11/29/2018 11:10:00 0.390570 NaN
316364 AAPL 11/29/2018 11:15:00 0.381313 NaN
316365 AAPL 11/29/2018 11:20:00 0.398182 NaN
316366 AAPL 11/29/2018 11:25:00 0.377364 0.2
316367 AAPL 11/29/2018 11:30:00 0.373627 0.2
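To see why the double argsort works, here is a small sketch on the first SPY window, plus a made-up tied window to show where the two approaches can differ. The variable names are illustrative only, and kind="stable" is used on the tied array just to make the tie-breaking order deterministic.

```python
import numpy as np
import pandas as pd

# First complete SPY window from the question.
window = np.array([0.377880, 0.391092, 0.392983, 0.399685, 0.392716])

order = window.argsort()   # positions that would sort the window
ranks0 = order.argsort()   # zero-based rank of each element
pct_last = (ranks0[-1] + 1) / len(window)
print(ranks0, pct_last)    # [0 1 3 4 2] 0.6

# Hypothetical window with a tie: argsort assigns ordinal ranks
# (ties broken by position), while Series.rank averages tied ranks
# by default.
tied = np.array([1.0, 2.0, 2.0])
ordinal = (tied.argsort(kind="stable").argsort()[-1] + 1) / len(tied)
average = pd.Series(tied).rank(pct=True).iloc[-1]
print(ordinal, average)    # 1.0 vs. roughly 0.833
```

So the two versions agree on this data, where no window contains duplicate ATR values, but they may give different numbers when values repeat within a window.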
Upvotes: 4