Reputation: 417
Apologies I am a noob looking to transition from R!
Reproducible data example;
df = pd.DataFrame(1.26 + np.random.rand(size)/100.0,
index=pd.date_range('20160101 09:00:00',
periods=size,
freq='60s'),
columns=['ATR20'])
I would like to take a value in the column ATR20 and compute its current percentile against rolling window of the previous n values of column ATR20.
Heres as far as I got:
for n in range(1,len(df)):
print(sum(df.ATR20[n:n+20] > df.ATR20))
Which gives the following error:
ValueError: Can only compare identically-labeled Series objects
I am sure that there is an easier way to do this with neater syntax but my searches haven't found anything close enough.
Thanks.
Upvotes: 3
Views: 575
Reputation: 12918
I think this will get what you're looking for. I used your example with only 10 elements in the dataframe and a rolling window of 3 elements, just so it is shorter when pasted here.
size = 10
df = pd.DataFrame(1.26 + np.random.rand(size)/100.0,
index=pd.date_range('20160101 09:00:00',
periods=size,
freq='60s'),
columns=['ATR20'])
# ATR20
# 2016-01-01 09:00:00 1.262522
# 2016-01-01 09:01:00 1.265116
# 2016-01-01 09:02:00 1.265051
# 2016-01-01 09:03:00 1.261109
# 2016-01-01 09:04:00 1.262699
# 2016-01-01 09:05:00 1.266710
# 2016-01-01 09:06:00 1.260186
# 2016-01-01 09:07:00 1.268001
# 2016-01-01 09:08:00 1.263227
# 2016-01-01 09:09:00 1.261331
You can use df.rolling
along with apply
to apply a function to a rolling window. I think the calculation I used here is more or less what you are looking for:
window = 3
df.rolling(window).apply(lambda x: sum([x[-1] > i for i in x]) / (window - 1))
# ATR20
# 2016-01-01 09:00:00 NaN
# 2016-01-01 09:01:00 NaN
# 2016-01-01 09:02:00 0.263665
# 2016-01-01 09:03:00 0.000000
# 2016-01-01 09:04:00 0.263932
# 2016-01-01 09:05:00 0.527632
# 2016-01-01 09:06:00 0.000000
# 2016-01-01 09:07:00 0.527024
# 2016-01-01 09:08:00 0.263754
# 2016-01-01 09:09:00 0.000000
Upvotes: 1