Reputation: 173
I have below dataframe structure as a sample.
I want to obtain a column where it calculates the percentile of the "price column" based on the value of the "percentile" column, using a rolling n period lookback.
Is it possible? I tried using some kind of a lambda function and use the .apply syntax but couldn't get it to work.
date percentile price desired_row
2019-11-08 0.355556 0.6863 36th percentile of price of last n period
2019-11-11 0.316667 0.6851 32nd percentile of price of last n period
2019-11-12 0.305556 0.6841 ...
2019-11-13 0.302778 0.6838 ...
2019-11-14 0.244444 0.6798 ...
Thanks!!
Upvotes: 1
Views: 258
Reputation: 29635
Based on this answer, you can use rolling
on the column price with the column percentile in index and then use quantile
in apply
with the parameter raw=False
:
window = 3
df['desired_row'] = df.set_index('percentile')['price'].rolling(window)\
.apply(lambda x: x.quantile(q=x.index[-1]), raw=False).values
print (df)
date percentile price desired_row
0 2019-11-08 0.355556 0.6863 NaN
1 2019-11-11 0.316667 0.6851 NaN
2 2019-11-12 0.305556 0.6841 0.684711
3 2019-11-13 0.302778 0.6838 0.683982
4 2019-11-14 0.244444 0.6798 0.681756
you can change the interpolation
parameter in quantile
depending on your need.
Upvotes: 1
Reputation: 120
You can use the rolling method in pandas. For example:
df = pd.DataFrame({'B': [0, 1, 2, 2, 4]})
df['rolling_mean'] = df['B'].rolling(2).mean()
will create a new column of the two period rolling average of the 'B' column. You can apply different methods if you need to compute a different summary statistic, for example:
df['rolling_sum'] = df['B'].rolling(2).sum()
See for more on the functionality: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rolling.html
Upvotes: 0