Reputation: 4807
I have the following dataframe df:
Col1 Col2 Val
T1 L2 1
T1 L2 1
T1 G3 3
G3 G3 4
G3 G3 6
G3 L2 7
L2 L2 8
L2 L2 9
I want to get the following:
Col1 Col2 Val
T1 L2 1
T1 L2 1
T1 G3 3
G3 G3 4
G3 L2 7
L2 L2 8
I want to drop duplicate rows based on column Val, but only when Col1 and Col2 are the same.
I am trying to use df.drop_duplicates(subset='Val', keep='first'), but I have been unable to add the condition on Col1 and Col2.
I am not sure how to approach the above.
Upvotes: 1
Views: 331
Reputation: 42916
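We can check for rows where Col1 == Col2 and where the (Col1, Col2) pair has already appeared in an earlier row, then keep everything except the rows that satisfy both conditions: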
df[~(df["Col1"].eq(df["Col2"]) & df.duplicated(subset=["Col1", "Col2"]))]
Col1 Col2 Val
0 T1 L2 1
1 T1 L2 1
2 T1 G3 3
3 G3 G3 4
5 G3 L2 7
6 L2 L2 8
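For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the sample frame from the question and applies the same mask. The intermediate names `same`, `repeated`, `result`, and `strict` are only illustrative. Note that in the sample data the dropped rows (G3 G3 6 and L2 L2 9) have different Val values from the rows kept, so the expected output matches deduplicating on the (Col1, Col2) pair. If Val should also have to match, adding it to `subset` is one possible reading, shown as `strict`; on this data it removes nothing.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Col1": ["T1", "T1", "T1", "G3", "G3", "G3", "L2", "L2"],
    "Col2": ["L2", "L2", "G3", "G3", "G3", "L2", "L2", "L2"],
    "Val": [1, 1, 3, 4, 6, 7, 8, 9],
})

# True where both key columns hold the same value
same = df["Col1"].eq(df["Col2"])

# True for every repeat of a (Col1, Col2) pair after its first occurrence
repeated = df.duplicated(subset=["Col1", "Col2"])

# Drop only the rows that are both "same" and a repeat
result = df[~(same & repeated)]
print(result)

# Alternative reading (an assumption): Val must match as well
strict = df[~(same & df.duplicated(subset=["Col1", "Col2", "Val"]))]
```

The original index is preserved, which is why the result skips labels 4 and 7; chain `.reset_index(drop=True)` if a contiguous index is needed.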
Upvotes: 2