Zanam

Reputation: 4807

Pandas: drop duplicates only when values in two other columns are the same

I have the following dataframe df:

Col1    Col2    Val
T1      L2      1
T1      L2      1
T1      G3      3
G3      G3      4
G3      G3      6
G3      L2      7
L2      L2      8
L2      L2      9

I want to get the following:

Col1    Col2    Val
T1      L2      1
T1      L2      1
T1      G3      3
G3      G3      4
G3      L2      7
L2      L2      8

I want to drop duplicate rows only where Col1 and Col2 hold the same value, keeping the first row of each such pair. Rows where Col1 and Col2 differ should all be kept, even if they repeat.

I tried df.drop_duplicates(subset='Val', keep='first'), but I have been unable to add the condition on Col1 and Col2.

I am not sure how to approach the above.

Upvotes: 1

Views: 331

Answers (1)

Erfan

Reputation: 42916

We can build a mask for rows where Col1 == Col2 and the (Col1, Col2) pair has already appeared earlier, then negate it to keep everything else:

df[~(df["Col1"].eq(df["Col2"]) & df.duplicated(subset=["Col1", "Col2"]))]
  Col1 Col2  Val
0   T1   L2    1
1   T1   L2    1
2   T1   G3    3
3   G3   G3    4
5   G3   L2    7
6   L2   L2    8
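
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the frame from the question and splits the mask into two named parts; the names same_cols, repeated_pair and result are just illustrative, not anything pandas requires.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Col1": ["T1", "T1", "T1", "G3", "G3", "G3", "L2", "L2"],
    "Col2": ["L2", "L2", "G3", "G3", "G3", "L2", "L2", "L2"],
    "Val":  [1, 1, 3, 4, 6, 7, 8, 9],
})

# True where Col1 and Col2 hold the same value in a row.
same_cols = df["Col1"].eq(df["Col2"])

# True for every occurrence of a (Col1, Col2) pair after the first
# (duplicated() defaults to keep="first").
repeated_pair = df.duplicated(subset=["Col1", "Col2"])

# Drop only the rows that satisfy both conditions.
result = df[~(same_cols & repeated_pair)]
print(result)
```

Note that the original index labels are kept (0, 1, 2, 3, 5, 6); chaining .reset_index(drop=True) would renumber them if that matters for your use case.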

Upvotes: 2
