Maxima

Reputation: 352

Remove rows after the first occurrence of a specific value

My data frame -

Col_A Col_B
101 1
101 2
101 3
101 4
101 1
101 2
102 1
102 2
102 3
102 2

Within each Col_A group, I want to drop every row that comes after the first occurrence of 3 in Col_B.

Desired output -

Col_A Col_B
101 1
101 2
101 3
102 1
102 2
102 3

How can I achieve this?

Upvotes: 2

Views: 58

Answers (2)

AlexK

Reputation: 3011

A more brute-force solution, but a readable one. Within each group, it finds the index of the first occurrence of 3 and keeps all rows up to and including that row. A group that contains no 3 is returned unchanged.

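def remove_rows(group):
    # Renumber rows 0..n-1 within the group
    group = group.reset_index(drop=True)
    # Labels of the rows where Col_B equals 3
    matches = group.index[group['Col_B'] == 3]
    if matches.empty:
        # No 3 in this group: keep every row
        return group
    # .loc slicing is inclusive, so the first matching row is kept
    return group.loc[:matches[0]]

result = df.groupby('Col_A').apply(remove_rows).reset_index(drop=True)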
print(result)

   Col_A  Col_B
0    101      1
1    101      2
2    101      3
3    102      1
4    102      2
5    102      3
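
For anyone who wants to run the per-group idea end to end, here is a minimal self-contained sketch. It rebuilds the question's data frame and loops over the groups with pd.concat rather than groupby.apply. The helper name keep_through_first and its value parameter are made up for illustration and are not part of the answer. Unlike the version above, it leaves the original index in place.

```python
import pandas as pd

# Data frame from the question
df = pd.DataFrame({
    "Col_A": [101, 101, 101, 101, 101, 101, 102, 102, 102, 102],
    "Col_B": [1, 2, 3, 4, 1, 2, 1, 2, 3, 2],
})


def keep_through_first(group, value=3):
    """Hypothetical helper: keep rows up to and including the first `value`."""
    hits = (group["Col_B"] == value).to_numpy()
    if not hits.any():
        # The value never appears in this group: keep the whole group
        return group
    # argmax on a boolean array gives the position of the first True
    return group.iloc[: hits.argmax() + 1]


# Process each Col_A group separately, then stitch the pieces back together
pieces = [keep_through_first(g) for _, g in df.groupby("Col_A", sort=False)]
result = pd.concat(pieces)
print(result)
```

Calling reset_index(drop=True) on the result would renumber the rows from 0, as in the output above.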

Upvotes: 0

jezrael

Reputation: 862591

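You can compare Col_B with 3, reverse the order with iloc[::-1], and pass the result to GroupBy.cummax. This marks every row up to and including the last 3 in each group, which is also the first 3 when the value appears only once per group, as in the sample data: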

df = df[df['Col_B'].iloc[::-1].eq(3).groupby(df['Col_A']).cummax().iloc[::-1]]
print(df)
   Col_A  Col_B
0    101      1
1    101      2
2    101      3
6    102      1
7    102      2
8    102      3
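
One caveat worth checking against your own data: because the mask is built from the end of each group, the one-liner keeps rows through the last 3 rather than the first, and it drops any group that has no 3 at all. The sketch below is one possible vectorized variant that cuts at the first 3 and keeps groups without one. The extra 3 in group 101 and the whole group 103 are hypothetical rows added to show the difference, and the variable names are illustrative.

```python
import pandas as pd

# Sample data modified for illustration (assumption, not from the question):
# group 101 contains a second 3, and group 103 contains no 3 at all
df = pd.DataFrame({
    "Col_A": [101, 101, 101, 101, 101, 101, 102, 102, 102, 102, 103, 103],
    "Col_B": [1, 2, 3, 4, 3, 2, 1, 2, 3, 2, 1, 2],
})

# 1 where Col_B equals 3, else 0
hits = df["Col_B"].eq(3).astype(int)

# Number of 3s seen strictly before each row within its Col_A group:
# the running count up to and including the row, minus the row's own hit
seen_before = hits.groupby(df["Col_A"]).cumsum() - hits

# Keep rows that have no 3 before them, i.e. everything through the first 3
result = df[seen_before.eq(0)]
print(result)
```

On this modified data the reversed-cummax one-liner would keep rows 0 through 4 of group 101 and drop group 103 entirely, while the variant keeps rows 0 through 2 of group 101 and both rows of group 103.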

Upvotes: 1
