TylerNG

Reputation: 941

Pandas deleting rows in order

Given a particular df:

ID Text
1  abc
1  xyz
2  xyz
2  abc
3  xyz
3  abc
3  ijk
4  xyz

I want to apply the following condition: within each ID group, if a row with abc exists, delete the rows with xyz in that group. The outcome would be:

ID Text
1  abc
2  abc
3  abc
3  ijk
4  xyz

Usually I would group by ID and apply np.where(...). However, I don't think that approach works here, since the condition on one row depends on the other rows in the same group.
Many thanks!

Upvotes: 2

Views: 60

Answers (2)

BENY

Reputation: 323316

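One option is pd.crosstab: count each Text per ID, mask xyz wherever abc is also present, then stack back to long form.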

s=pd.crosstab(df.ID,df.Text)
s.xyz=s.xyz.mask(s.abc.eq(1)&s.xyz.eq(1))
s
Out[162]:
Text  abc  ijk  xyz
ID
1       1    0  NaN
2       1    0  NaN
3       1    1  NaN
4       0    0  1.0
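# zeros become NaN, stack back to long form, drop the NaN cells, then discard the count column
s.replace(0,np.nan).stack().dropna().reset_index().drop(columns=0)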
Out[167]: 
   ID Text
0   1  abc
1   2  abc
2   3  abc
3   3  ijk
4   4  xyz
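For reference, here is a self-contained sketch of the same crosstab idea. The variable names (counts, long, result) are mine, and it deviates from the snippet above in one place: it masks xyz wherever the abc count is greater than zero rather than exactly 1, on the assumption that an (ID, Text) pair may occur more than once.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3, 3, 3, 4],
    "Text": ["abc", "xyz", "xyz", "abc", "xyz", "abc", "ijk", "xyz"],
})

# One row per ID, one column per Text value, cells hold occurrence counts
counts = pd.crosstab(df["ID"], df["Text"])

# Blank out xyz for every ID that also has at least one abc
counts["xyz"] = counts["xyz"].mask(counts["abc"].gt(0))

# Zero counts become NaN, then go back to long form and drop the NaN cells
long = counts.replace(0, np.nan).stack().dropna()

# Keep only the (ID, Text) pairs; the count column is no longer needed
result = long.reset_index()[["ID", "Text"]]
print(result)
```

Keep in mind that crosstab collapses repeated (ID, Text) pairs into a single count and the original row index is not preserved, so this fits best when each pair is unique and the index does not matter.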

Upvotes: 2

cs95

Reputation: 402814

To the best of my knowledge, you can vectorize this with a groupby + transform:

df[~(df.Text.eq('abc').groupby(df.ID).transform('any') & df.Text.eq('xyz'))]

   ID Text
0   1  abc
3   2  abc
5   3  abc
6   3  ijk
7   4  xyz
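The one-liner can be unpacked into named steps, which may make the logic easier to follow. This is only a sketch: the frame is rebuilt from the question's sample data, and the intermediate names (group_has_abc, is_xyz, result) are mine.

```python
import pandas as pd

df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3, 3, 3, 4],
    "Text": ["abc", "xyz", "xyz", "abc", "xyz", "abc", "ijk", "xyz"],
})

# True on every row whose ID group contains at least one 'abc'
group_has_abc = df["Text"].eq("abc").groupby(df["ID"]).transform("any")

# True on the rows that are candidates for deletion
is_xyz = df["Text"].eq("xyz")

# Drop a row only when both conditions hold; everything else is kept
result = df[~(group_has_abc & is_xyz)]
print(result)
```

Because transform returns a Series aligned with the original index, the boolean mask can be applied to df directly, and the surviving rows keep their original index labels and order.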

Upvotes: 2
