Reputation: 352
My data frame:
Col_A | Col_B |
---|---|
101 | 1 |
101 | 2 |
101 | 3 |
101 | 4 |
101 | 1 |
101 | 2 |
102 | 1 |
102 | 2 |
102 | 3 |
102 | 2 |
Within each Col_A group, I want to drop every row that comes after the first occurrence of 3 in Col_B.
Desired output:
Col_A | Col_B |
---|---|
101 | 1 |
101 | 2 |
101 | 3 |
102 | 1 |
102 | 2 |
102 | 3 |
How can I achieve this?
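For anyone who wants to run the answers, the frame above can probably be rebuilt as follows. This is only a sketch: the post shows the rendered table, so the integer dtypes and the default 0..9 index are assumptions.

```python
import pandas as pd

# Rebuild the example frame from the question.
# Assumption: both columns are plain integers and the index is the default RangeIndex.
df = pd.DataFrame({
    "Col_A": [101] * 6 + [102] * 4,
    "Col_B": [1, 2, 3, 4, 1, 2, 1, 2, 3, 2],
})
print(df)
```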
Upvotes: 2
Views: 58
Reputation: 3011
A more brute-force solution, but a readable one. Within each Col_A group, it finds the index of the first occurrence of 3 in Col_B and keeps all rows up to and including that row. A group that contains no 3 is returned unchanged.
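def remove_rows(df):
    # Renumber the group's rows from 0 so the label slice below is positional
    df = df.reset_index(drop=True)
    hits = df[df['Col_B'] == 3].index
    if hits.empty:
        # No 3 in this group: keep every row
        return df
    # .loc slices include the end label, so the row with the first 3 is kept
    return df.loc[:hits[0]]

result = df.groupby('Col_A').apply(remove_rows).reset_index(drop=True)
print(result)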
   Col_A  Col_B
0    101      1
1    101      2
2    101      3
3    102      1
4    102      2
5    102      3
Upvotes: 0
Reputation: 862591
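You can compare the values of Col_B with 3 on the reversed Series (iloc[::-1]), pass the result to GroupBy.cummax per Col_A group, and reverse it back to get a boolean mask. The mask is True from the start of each group through its last 3, which is the same row as the first 3 when a group contains only one 3, as in the sample data. A group without any 3 is dropped entirely: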
df = df[df['Col_B'].iloc[::-1].eq(3).groupby(df['Col_A']).cummax().iloc[::-1]]
print(df)
   Col_A  Col_B
0    101      1
1    101      2
2    101      3
6    102      1
7    102      2
8    102      3
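The reversed cummax mask keeps rows through the last 3 in each group, so it matches the question only when a group holds a single 3. If a group may contain several 3s, one possible variant is to count the 3s seen strictly before each row and keep the rows where that count is zero. The sketch below is an assumption-laden illustration, not part of the original answer: the second 3 in group 101 is hypothetical test data, and the names is_three, threes_before and first_cut are made up for the example.

```python
import pandas as pd

# Sample data from the question, with one value changed (hypothetical):
# row 4 of group 101 is a second 3, to show the cut happens at the first one.
df = pd.DataFrame({
    "Col_A": [101, 101, 101, 101, 101, 101, 102, 102, 102, 102],
    "Col_B": [1, 2, 3, 4, 3, 2, 1, 2, 3, 2],
})

# 1 where Col_B is 3, else 0
is_three = df["Col_B"].eq(3).astype(int)

# Running count of 3s per Col_A group, minus the current row's own flag,
# gives the number of 3s that appear strictly before each row.
threes_before = is_three.groupby(df["Col_A"]).cumsum() - is_three

# Keep rows with no earlier 3 in their group (the first 3 itself is kept).
first_cut = df[threes_before.eq(0)]
print(first_cut)
```

With this mask a group that has no 3 at all keeps every row, which matches the behaviour of the remove_rows answer rather than the reversed cummax one.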
Upvotes: 1