Reputation: 330
Start with a sorted table:
Index | A  | B | C
0     | A1 | 0 | Group 1
1     | A1 | 0 | Group 1
2     | A1 | 1 | Group 2
3     | A1 | 1 | Group 2
4     | A1 | 2 | Group 3
5     | A1 | 2 | Group 3
6     | A2 | 7 | Group 4
7     | A2 | 7 | Group 4
The result I want is records 0, 1, 2, 3, 6, and 7.
First I want to create groups based on columns A and B. Then, for each column A group, I want only its first two subgroups returned, with all of the records in each of those subgroups.
Thank you so much.
Upvotes: 1
Views: 204
Reputation: 294258
Use pd.factorize within a groupby and filter for values less than 2:
df[df.groupby('A').B.transform(lambda x: x.factorize()[0]).lt(2)]
# same as
# df[df.groupby('A').B.transform(lambda x: x.factorize()[0]) < 2]
A B C
0 A1 0 Group 1
1 A1 0 Group 1
2 A1 1 Group 2
3 A1 1 Group 2
6 A2 7 Group 4
7 A2 7 Group 4
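For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the table from the question and applies the same idea. The names `subgroup_rank` and `result` are only illustrative, and the DataFrame construction is my assumption about how the data is stored. The intermediate Series shows what factorize does: it numbers each distinct B value within its A group in order of first appearance.

```python
import pandas as pd

# Rebuild the table from the question (assumed layout: default integer index)
df = pd.DataFrame({
    "A": ["A1"] * 6 + ["A2"] * 2,
    "B": [0, 0, 1, 1, 2, 2, 7, 7],
    "C": ["Group 1"] * 2 + ["Group 2"] * 2 + ["Group 3"] * 2 + ["Group 4"] * 2,
})

# Within each A group, label every distinct B value 0, 1, 2, ...
# in order of first appearance
subgroup_rank = df.groupby("A")["B"].transform(lambda s: pd.factorize(s)[0])
print(subgroup_rank.tolist())  # [0, 0, 1, 1, 2, 2, 0, 0]

# Keep only the rows that belong to the first two subgroups of each A group
result = df[subgroup_rank < 2]
print(result.index.tolist())  # [0, 1, 2, 3, 6, 7]
```

Because the labels follow order of appearance, "first two" means the first two B values encountered in each A group. That matches the two smallest values here only because the table is already sorted.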
Upvotes: 2