Reputation: 63
I have a dataframe with 3 columns. I would like to drop duplicates in column A based on the values in the other columns. I have searched tirelessly and can't find a solution like this.
example:
A | B | C |
---|---|---|
Family1 | nan | nan |
Family1 | nan | 1234 |
Family1 | 1245 | nan |
Family1 | 3456 | 78787 |
Family2 | nan | nan |
Family3 | nan | nan |
Basically, I want to drop a duplicate ONLY IF the other two columns are both NaN; otherwise, the duplicate can stay.
desired output:
A | B | C |
---|---|---|
Family1 | nan | 1234 |
Family1 | 1245 | nan |
Family1 | 3456 | 78787 |
Family2 | nan | nan |
Family3 | nan | nan |
Family2 and Family3 remain in the df because they don't have duplicates, even though both columns are NaN.
Upvotes: 1
Views: 71
Reputation: 23099
Try a double boolean mask: one condition is True for all duplicates in column A, and the other is True for rows where every column after 'A' is null. If both conditions are met, we exclude that row using the ~ operator, which inverts a boolean.
df[~(df.duplicated(subset=['A'], keep=False) & df.iloc[:, 1:].isna().all(axis=1))]
A B C
1 Family1 NaN 1234
2 Family1 1245 NaN
3 Family1 3456 78787
4 Family2 NaN NaN
5 Family3 NaN NaN
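A minimal, self-contained sketch of the approach above, reconstructing the example dataframe from the question (assuming B and C are numeric columns that use NaN for missing values):

```python
import pandas as pd
import numpy as np

# Rebuild the example dataframe from the question.
df = pd.DataFrame({
    "A": ["Family1", "Family1", "Family1", "Family1", "Family2", "Family3"],
    "B": [np.nan, np.nan, 1245, 3456, np.nan, np.nan],
    "C": [np.nan, 1234, np.nan, 78787, np.nan, np.nan],
})

# Condition 1: row's value in A appears more than once (keep=False
# marks every occurrence, not just the later ones).
# Condition 2: every column after A is NaN in this row.
mask = df.duplicated(subset=["A"], keep=False) & df.iloc[:, 1:].isna().all(axis=1)

# Drop only the rows where BOTH conditions hold.
result = df[~mask]
print(result)
```

Only the first Family1 row (a duplicate with B and C both NaN) is removed; Family2 and Family3 survive because, although all-NaN, they are not duplicated in A.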
Upvotes: 3
Reputation: 26686
You were not very clear. I suspect you want to drop duplicates in column A only when both columns B and C are NaN. If so, please try:
df[~(df.A.duplicated(keep=False) & (df.B.isna() & df.C.isna()))]
Upvotes: 3