Reputation: 4292
I have a Pandas dataframe that includes a grouping variable. An example can be produced using:
import pandas as pd
df = pd.DataFrame({'grp': ['a','a','b','b','b','c','d','d','d','d'],
                   'data': [4,5,3,6,7,8,9,8,7,3]})
...which looks like:
grp data
0 a 4
1 a 5
2 b 3
3 b 6
4 b 7
5 c 8
6 d 9
7 d 8
8 d 7
9 d 3
I can retrieve the last two rows of each group using:
dfgrp = df.groupby('grp').tail(2)
However, I would like to produce a mask that identifies the last two rows (or 1 row if only 1 exists), ideally producing an output that looks like:
0 True
1 True
2 False
3 True
4 True
5 True
6 False
7 False
8 True
9 True
I thought this would be relatively straightforward, but I haven't been able to find a solution. Suggestions would be greatly appreciated.
Upvotes: 0
Views: 340
Reputation: 4224
If your index is unique, you could do this by using isin:
import pandas as pd
df = pd.DataFrame({'grp': ['a','a','b','b','b','c','d','d','d','d'],
                   'data': [4,5,3,6,7,8,9,8,7,3]})
# Flag rows whose index label appears among the last two rows of each group
df['mask'] = df.index.isin(df.groupby('grp').tail(2).index)
df
grp data mask
0 a 4 True
1 a 5 True
2 b 3 False
3 b 6 True
4 b 7 True
5 c 8 True
6 d 9 False
7 d 8 False
8 d 7 True
9 d 3 True
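If the index is not unique, the isin approach can flag extra rows, because an index label no longer identifies a single row. One possible alternative, offered as a sketch and not as part of the original answer, is to number the rows of each group from the end with cumcount(ascending=False) and keep the rows numbered 0 or 1. The variable name mask is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'grp': ['a','a','b','b','b','c','d','d','d','d'],
                   'data': [4,5,3,6,7,8,9,8,7,3]})

# cumcount(ascending=False) counts backwards within each group:
# the last row of a group gets 0, the one before it gets 1, and so on.
# Comparing with 2 keeps the last two rows (or the only row) of each group.
mask = df.groupby('grp').cumcount(ascending=False) < 2
```

The result is a boolean Series aligned with df, so df['mask'] = mask attaches it as a column. Changing 2 to another number should give the last n rows per group in the same way.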
Upvotes: 5