IceAloe

Reputation: 519

python - keep duplicates if two columns equal

I have a dataset that looks like below:

col1  col2  col3
a     b     c
a     d     x
b     c     e
s     f     e
f     f     e

I need to drop duplicates in col3, but when several rows share the same col3 value I want to keep the one where col1 equals col2. The expected result is:

col1  col2  col3
a     b     c
a     d     x
f     f     e

Is there a way to nest this condition in df = df.drop_duplicates(subset=['col3'])?

Upvotes: 4

Views: 56

Answers (1)

BENY

Reputation: 323276

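Yes, we can do this with argsort: reorder the rows so that those where col1 equals col2 come last, then drop duplicates on col3 with keep='last' so the matching row is the one that survives.

df = df.iloc[df.eval('col1==col2').argsort()].drop_duplicates('col3', keep='last')

Output:

  col1 col2 col3
0    a    b    c
1    a    d    x
4    f    f    e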
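
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea on the sample data from the question. It is not the exact one-liner above: it assumes a stable sort via NumPy (`kind="stable"`) so that rows keep their original relative order, and it adds a final `sort_index()` to restore the original row order. The names `same`, `order`, and `result` are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "col1": ["a", "a", "b", "s", "f"],
    "col2": ["b", "d", "c", "f", "f"],
    "col3": ["c", "x", "e", "e", "e"],
})

# Boolean mask: True where col1 equals col2
same = df["col1"].eq(df["col2"])

# Stable argsort on the mask: False rows first, True rows last,
# with the original order preserved inside each group
order = np.argsort(same.to_numpy(), kind="stable")

# keep="last" therefore prefers a row where col1 == col2, if one exists
result = (
    df.iloc[order]
      .drop_duplicates("col3", keep="last")
      .sort_index()
)

print(result)
```

One thing to be aware of: if a col3 value has no row where col1 equals col2, this still keeps one row for it (the last one in the original order), so check whether that matches what you need.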

Upvotes: 3
