Reputation: 1010
I need to get the rolling second-largest value of a DataFrame.
To get the rolling largest value I do:
max = df.sort_index(ascending=True).rolling(10).max()
When I try the same thing with nlargest, Python throws an error:
max = df.sort_index(ascending=True).rolling(10).nlargest(2)
AttributeError: 'Rolling' object has no attribute 'nlargest'
Is this a bug? What else can I use that is performant?
Upvotes: 3
Views: 1867
Reputation: 863226
Sort each window in descending order with np.sort and select the second value:
import numpy as np
import pandas as pd

np.random.seed(2019)
df = pd.DataFrame({
    'B': np.random.randint(20, size=15)
})
print(df)
B
0 8
1 18
2 5
3 15
4 12
5 10
6 16
7 16
8 7
9 5
10 19
11 12
12 16
13 18
14 5
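# negate, sort ascending, negate back: descending order, then take the second element
a = df.rolling(10).apply(lambda x: -np.sort(-x)[1])
# alternative: sort ascending and take the second-to-last element
#a = df.rolling(10).apply(lambda x: np.sort(x)[-2])
print(a)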
B
0 NaN
1 NaN
2 NaN
3 NaN
4 NaN
5 NaN
6 NaN
7 NaN
8 NaN
9 16.0
10 18.0
11 16.0
12 16.0
13 18.0
14 18.0
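Since the question asks for something performant, here is a rough sketch of a variant that avoids a full sort. It assumes two things that are not in the code above: passing raw=True so each window arrives as a plain NumPy array instead of a Series, and using np.partition, which only guarantees that the second-largest element ends up at position -2. The helper name second_largest is made up for illustration, and the column values are the ones printed above, typed in by hand.

```python
import numpy as np
import pandas as pd

def second_largest(window: np.ndarray) -> float:
    # Partial sort: only the element at index -2 is guaranteed to be
    # in its final sorted position, which is the second-largest value.
    return np.partition(window, -2)[-2]

# Same values as column B in the example above, entered explicitly.
df = pd.DataFrame({"B": [8, 18, 5, 15, 12, 10, 16, 16, 7, 5, 19, 12, 16, 18, 5]})

# raw=True hands each window to the function as a NumPy array.
second = df.rolling(10).apply(second_largest, raw=True)
print(second["B"].dropna().tolist())
# [16.0, 18.0, 16.0, 16.0, 18.0, 18.0]
```

As with np.sort, duplicates count separately, so a window whose two largest values are both 16 gives 16. Whether this is noticeably faster depends on the window size, so it is worth timing on your own data.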
Upvotes: 3
Reputation: 2905
I'd do something like this:
df.rolling(10).apply(lambda x: pd.Series(x).nlargest(2).iloc[-1])
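A quick way to convince yourself this matches the np.sort approach from the other answer is to run both on the same data and compare. This is only a sanity check on the sample values shown in that answer (typed in by hand here), and the variable names are just for illustration.

```python
import numpy as np
import pandas as pd

# Sample column taken from the other answer's printed DataFrame.
df = pd.DataFrame({"B": [8, 18, 5, 15, 12, 10, 16, 16, 7, 5, 19, 12, 16, 18, 5]})

# nlargest(2) keeps the two largest values in descending order,
# so the last of them is the second largest.
by_nlargest = df.rolling(10).apply(lambda x: pd.Series(x).nlargest(2).iloc[-1])

# Full ascending sort, second-to-last element.
by_sort = df.rolling(10).apply(lambda x: np.sort(x)[-2])

# DataFrame.equals treats NaNs in the same positions as equal.
print(by_nlargest.equals(by_sort))
```

Both versions treat repeated values as separate entries, so a window with two 16s at the top returns 16 rather than the next distinct value.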
Upvotes: 6