Python pandas Dataframe : Delete all rows until the first occurrence of a certain value

Question

I have a pandas DataFrame that looks like the table below. Please let me know if you need the pd.DataFrame code to build it.

iD      a   b   c
c1      2   3   4
c1      2   3   4
c1      2   3   4
c1      2   E   4
c1      2   3   4
c2      3   4   5
c2      3   4   5
c2      3   E   5
c2      3   4   5

This DataFrame has two IDs, c1 and c2. For each ID, I want to delete all the rows above the first row where 'E' appears in column 'b'.

My final DataFrame should look like this:

iD      a   b   c
c1      2   E   4
c1      2   3   4
c2      3   E   5
c2      3   4   5

I'm trying to keep the question short so it is easy to answer. Please let me know if I should add more data points to the DataFrame.

cs95 · Accepted Answer

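Use groupby and cumsum on a boolean mask that compares column "b" to the letter "E". Within each iD group the running sum stays at 0 until the first "E" appears, so keeping the rows where it is greater than 0 drops everything above that row. (On recent pandas versions the grouped cumsum of a boolean mask comes back as integers, so the explicit gt(0) is needed to turn it back into a mask.)

df[df.b.eq('E').groupby(df.iD).cumsum().gt(0)]

   iD  a  b  c
3  c1  2  E  4
4  c1  2  3  4
7  c2  3  E  5
8  c2  3  4  5

To renumber the index from 0, chain reset_index(drop=True):

df[df.b.eq('E').groupby(df.iD).cumsum().gt(0)].reset_index(drop=True)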

   iD  a  b  c
0  c1  2  E  4
1  c1  2  3  4
2  c2  3  E  5
3  c2  3  4  5
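
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample table from the question and applies the same groupby/cumsum idea step by step. The DataFrame construction and the intermediate names (is_e, seen_e, result) are my own assumptions for illustration, not part of the original post, and the mask is cast to int before summing so the behaviour does not depend on how a given pandas version handles boolean cumsum.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed construction).
df = pd.DataFrame({
    "iD": ["c1"] * 5 + ["c2"] * 4,
    "a": [2] * 5 + [3] * 4,
    "b": [3, 3, 3, "E", 3, 4, 4, "E", 4],
    "c": [4] * 5 + [5] * 4,
})

# True only on the rows where column "b" holds the marker value "E".
is_e = df["b"].eq("E")

# Running count of markers within each iD group; it is 0 on every row
# above the first "E" of that group, so gt(0) marks the rows to keep.
seen_e = is_e.astype(int).groupby(df["iD"]).cumsum().gt(0)

# Keep the first "E" row and everything after it, then renumber the index.
result = df[seen_e].reset_index(drop=True)
print(result)
```

Because the mask is built per group, an iD that never contains "E" would be dropped entirely under this approach, which may or may not be what you want.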
