Reputation: 47
I have the following dataframe:
index | A | B |
---|---|---|
0 | a | 3 |
1 | a | 4 |
2 | b | 9 |
3 | b | 6 |
4 | a | 2 |
5 | b | 1 |
And I would like to get the second last value of each group of column "A". I already figured out how to get the min() value with a groupby :
df_grouped_last = df.groupby('A').agg({'B': ['min']})
But I need to get the second last value ("before last") so I can get :
index | A | 2nd last B |
---|---|---|
0 | a | 3 |
1 | b | 6 |
I will also need the third last and fourth in another work.
Any chance someone know how to code it ?
Thanks a lot ! Vincent
Upvotes: 1
Views: 2183
Reputation: 28644
Looking at your expected output, the assumption is that column B
is sorted for each group. If that is the case, use sort_values, combined with nth:
(df.sort_values(['A', 'B'])
.groupby('A', sort = False)
.B
.nth(-2) # familiar python construct ...
# takes second value from the bottom, per group
.reset_index()
)
A B
0 a 3
1 b 6
Upvotes: 1
Reputation: 2128
Use:
df = (df.groupby('A', as_index = False)['B']
.agg({'2nd last B': lambda x: x.iloc[-2] if len(x) > 1 else x}))
Output:
>>> df
A 2nd last B
0 a 4
1 b 6
Upvotes: 0
Reputation: 323236
Let us try sort_values
then use position
out = df.sort_values('B').groupby('A').apply(lambda x : x.iloc[1])
Out[68]:
index A B
A
a 0 a 3
b 3 b 6
Upvotes: 2