Mak_3000

Reputation: 23

pandas rolling apply on a custom function

I would like to apply pandas.rank on a rolling basis. I tried to use pandas.rolling.apply, but unfortunately rolling doesn't work with 'rank'.

Is there a workaround?

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(10, 3))

def my_rank(x):
   return x.rank(pct=True)

df.rolling(3).apply(my_rank)

Upvotes: 1

Views: 3179

Answers (1)

Max Power

Reputation: 8954

Code:

def my_rank(x):
   # wrap the window in a Series, rank it, and keep only the last value's rank
   return pd.Series(x).rank(pct=True).iloc[-1]

df.rolling(3).apply(my_rank)

Output:

          0         1         2
0       NaN       NaN       NaN
1       NaN       NaN       NaN
2  0.666667  0.333333  0.666667
3  1.000000  0.333333  1.000000
4  0.666667  1.000000  0.333333
5  0.333333  0.666667  0.666667
6  1.000000  0.333333  0.666667
7  0.333333  0.333333  1.000000
8  1.000000  0.666667  1.000000
9  0.666667  1.000000  0.666667

Explanation:

Your code (great minimal reproducible example, btw!) threw the following error: AttributeError: 'numpy.ndarray' object has no attribute 'rank'. That meant the x in your my_rank function was being passed as a numpy array, not a pandas Series. So first I updated return x.rank... to return pd.Series(x).rank...
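To see what rolling.apply actually hands to the function, here is a small sketch that records the type of each window. The names record_type and seen are made up for illustration. It assumes a pandas version that supports the raw argument of rolling.apply; the default value of raw has changed between pandas versions, so it is passed explicitly here.

```python
import numpy as np
import pandas as pd

seen = {"raw_true": set(), "raw_false": set()}

def record_type(window, key):
    # Remember the type of the window object, then return a dummy float,
    # because rolling.apply expects one scalar per window.
    seen[key].add(type(window).__name__)
    return 0.0

s = pd.Series([1.0, 2.0, 3.0, 4.0])

# raw=True: each window arrives as a NumPy array (no .rank method)
s.rolling(2).apply(record_type, raw=True, kwargs={"key": "raw_true"})

# raw=False: each window arrives as a pandas Series
s.rolling(2).apply(record_type, raw=False, kwargs={"key": "raw_false"})

print(seen)
```

Wrapping the window in pd.Series(x), as in the answer, works in either case.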

Then I got the following error: TypeError: cannot convert the series to <class 'float'>. That makes sense, because pd.Series.rank takes a series of n numbers and returns a (ranked) series of n numbers. But since we're calling rank not once on a series but repeatedly on a rolling window of a series, we only want one number as output for each rolling calculation. Hence the iloc[-1], which keeps only the rank of the last (most recent) value in each window.
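As a check on the reasoning above, here is a small deterministic sketch with values that are easy to rank by hand. The column name "a" and the function name rank_of_last are made up for illustration, and raw=True is passed explicitly as an assumption so the behaviour does not depend on the pandas default.

```python
import numpy as np
import pandas as pd

def rank_of_last(window):
    # Rank all values in the window as percentiles, then return only
    # the rank of the last value so each window yields a single float.
    return pd.Series(window).rank(pct=True).iloc[-1]

df = pd.DataFrame({"a": [3.0, 1.0, 2.0, 5.0, 4.0]})
rolled = df.rolling(3).apply(rank_of_last, raw=True)
print(rolled)
```

The first two rows are NaN because the window is not yet full. For the window [3, 1, 2] the last value 2 is the second smallest of three, so its percentile rank is 2/3; for [1, 2, 5] the last value is the largest, so the rank is 1.0; for [2, 5, 4] it is 2/3 again.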

Upvotes: 2
