Reputation: 941
Given a particular df:
ID Text
1 abc
1 xyz
2 xyz
2 abc
3 xyz
3 abc
3 ijk
4 xyz
I want to apply condition where: Grouping by ID, if abc exists then delete row with xyz. The outcome would be:
ID Text
1 abc
2 abc
3 abc
3 ijk
4 xyz
Usually I would group them by Id and apply np.where(...). However, I don't think this approach would work for this case since it's based on rows.
Many thanks!
Upvotes: 2
Views: 60
Reputation: 323316
I am using crosstab
s=pd.crosstab(df.ID,df.Text)
s.xyz=s.xyz.mask(s.abc.eq(1)&s.xyz.eq(1))
s
Out[162]:
Text abc ijk xyz
ID
1 1 0 NaN
2 1 0 NaN
3 1 1 NaN
4 0 0 1.0
s.replace(0,np.nan).stack().reset_index().drop(0,1)
Out[167]:
ID Text
0 1 abc
1 2 abc
2 3 abc
3 3 ijk
4 4 xyz
Upvotes: 2
Reputation: 402814
To the best of my knowledge, you can vectorize this with a groupby
+ transform
:
df[~(df.Text.eq('abc').groupby(df.ID).transform('any') & df.Text.eq('xyz'))]
ID Text
0 1 abc
3 2 abc
5 3 abc
6 3 ijk
7 4 xyz
Upvotes: 2