Reputation: 343
This gives Percentile Values for a column below,
df[column].rank(pct=True)
But is there a way to get the rolling percentile Values for a column by using this similar method, without writing functions or loops and longer code? Currently when adding the .rolling(window)
, it is not working.
If there is not, Pandas desperately needs to add this as a feature for the .rolling()
, it is there for .quantile()
, it makes no sense to not have it for the percentiles as well.
Upvotes: 3
Views: 1792
Reputation: 444
While waiting for Rolling rank to be added in pandas 1.4.0, one way to do this could be like so :
import pandas as pd
df[column].rolling(window).apply(lambda x: len(x[x <= x.iloc[-1]]) / len(x))
Where window is the window
on which you sought to roll. This should give you the same result as if you were using df[column].rank(pct=True)
on a dataframe equal to the window at hand.
The lambda function divides the number of occurences under or equal the last value by the total number of value, giving us the percentage of scores fall below the current row.
Note: Values under the window size will return NaN
. For example the first value for a window=2
.
Example :
n = [5, 10, 9, 7, 6, 9, 4]
df = pd.DataFrame(n, columns=['Scores'])
df['rank_result'] = df['Scores'].rank(pct=True)
df['rolling_apply'] = df['Scores'].rolling(7).apply(lambda x: len(x[x <= x.iloc[-1]]) / len(x))
print(df[6:7]) # get the last row to compare
This will give the following output for the last row :
Scores rank_result rolling_apply
6 4 0.142857 0.142857
Upvotes: 1