Reputation: 127
I have a dataframe like this:
A B C
12 true 1
12 true 1
3 nan 2
3 nan 3
I would like to drop all rows where the value of column A is duplicated, but only if the value of column B is 'true'.
The resulting dataframe I have in mind is:
A B C
12 true 1
3 nan 2
3 nan 3
I tried using: df.loc[df['B']=='true'].drop_duplicates('A', inplace=True, keep='first')
but it doesn't seem to work.
Thanks for your help!
Upvotes: 10
Views: 2863
Reputation: 294218
df[df.B.ne(True) | ~df.A.duplicated()]
A B C
0 12 True 1
2 3 NaN 2
3 3 NaN 3
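For anyone who wants to run this end to end, here is a minimal sketch of the same mask, assuming B holds the boolean True (not the string 'true') and NaN. The names not_true, first_seen and result are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, assuming B is boolean True / NaN.
df = pd.DataFrame({
    "A": [12, 12, 3, 3],
    "B": [True, True, np.nan, np.nan],
    "C": [1, 1, 2, 3],
})

# True where B is anything other than True (NaN included).
not_true = df["B"].ne(True)
# True for the first occurrence of each value in A.
first_seen = ~df["A"].duplicated()

# Keep a row if B is not True, or if it is the first row with that A value.
result = df[not_true | first_seen]
print(result)
```

Note that duplicated() looks at every row, not only the rows where B is True. If a value of A first appears in a row where B is NaN and later in a row where B is True, the later row is dropped as well, which may or may not be what you want.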
Upvotes: 5
Reputation: 323226
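You can use pd.concat:
split the df by B, drop the duplicates only in the part where B is True, then concatenate and sort the index back into the original order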
df = pd.concat([df.loc[df.B != True], df.loc[df.B == True].drop_duplicates(['A'], keep='first')]).sort_index()
df
Out[1593]:
A B C
0 12 True 1
2 3 NaN 2
3 3 NaN 3
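The same idea spelled out step by step, as a rough sketch that assumes B holds the boolean True and NaN; is_true, kept_true and result are just illustrative names.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, assuming B is boolean True / NaN.
df = pd.DataFrame({
    "A": [12, 12, 3, 3],
    "B": [True, True, np.nan, np.nan],
    "C": [1, 1, 2, 3],
})

# Boolean mask: True only where B equals True (NaN compares as False).
is_true = df["B"].eq(True)

# Deduplicate on A within the B == True rows only.
kept_true = df[is_true].drop_duplicates(subset="A", keep="first")

# Put the untouched rows back and restore the original row order.
result = pd.concat([df[~is_true], kept_true]).sort_index()
print(result)
```

Because the deduplication runs only on the B == True part, rows where B is NaN never count as an earlier occurrence of a value in A.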
Upvotes: 11