Keep rows until first occurrence of value per group

Question

Here is a simplified example of my pandas dataframe:

     User  Binary
0   UserA       0
1   UserA       0
2   UserA       0
3   UserA       1
4   UserA       0
5   UserA       1
6   UserA       0
7   UserA       0
8   UserB       0
9   UserB       0
10  UserB       0
11  UserB       0
12  UserB       0
13  UserB       1
14  UserB       1
15  UserB       0
16  UserC       0
17  UserC       0

For each User, I would like to remove all rows after the first occurrence of Binary=1, keeping that row itself. Note that some Users have no rows with Binary=1 (e.g. UserC in this example); all of their rows should be kept.

Output would look like below:

     User  Binary
0   UserA       0
1   UserA       0
2   UserA       0
3   UserA       1
8   UserB       0
9   UserB       0
10  UserB       0
11  UserB       0
12  UserB       0
13  UserB       1
16  UserC       0
17  UserC       0

yatu · Accepted Answer

Here's one approach using groupby and transforming with a custom function:

# flag rows where Binary equals 1 and group that boolean series by User
g = df.Binary.eq(1).groupby(df.User)
# per group, broadcast the index label of the first True (idxmax),
# or the group's last index label if there are no Trues
m = g.transform(lambda x: x.idxmax() if x.any() else x.index[-1])
# keep the rows whose index is less than or equal to m
# (this assumes a sorted index, as in the example)
out = df[df.index <= m]

print(out)

     User  Binary
0   UserA       0
1   UserA       0
2   UserA       0
3   UserA       1
8   UserB       0
9   UserB       0
10  UserB       0
11  UserB       0
12  UserB       0
13  UserB       1
16  UserC       0
17  UserC       0
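
If the index is not a sorted integer range, comparing index labels may not reflect row order. Here's a minimal sketch of an alternative that counts earlier 1s per group with a cumulative sum instead. It rebuilds the sample frame from the question so it can be run on its own; the names is_one and seen_before are illustrative and not part of the original code.

```python
import pandas as pd

# Rebuild the sample frame from the question
df = pd.DataFrame({
    "User": ["UserA"] * 8 + ["UserB"] * 8 + ["UserC"] * 2,
    "Binary": [0, 0, 0, 1, 0, 1, 0, 0,
               0, 0, 0, 0, 0, 1, 1, 0,
               0, 0],
})

# 1 where Binary equals 1, else 0
is_one = df["Binary"].eq(1).astype(int)
# number of 1s seen strictly before each row, within its User group
seen_before = is_one.groupby(df["User"]).cumsum() - is_one
# keep only the rows that have no earlier 1 in their group
out = df[seen_before.eq(0)]

print(out)
```

On the sample data this keeps the same rows as the output above (index 0-3, 8-13, 16 and 17). Users with no 1 at all keep every row, since their count of earlier 1s stays at zero.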
