Drop NaN containing rows in pandas DataFrame with column condition

Question

I have a dataframe with Columns A,B,D and C. I would like to drop all NaN containing rows in the dataframe only where D and C columns contain value 0.

Eg:

Would anyone be able to help me in this issue.

Thanks & Best Regards

Michael

jezrael · Accepted Answer

Use boolean indexing with inverted mask by ~:

np.random.seed(2021)
    
df = pd.DataFrame(np.random.choice([1,0,np.nan], size=(10, 4)), columns=list('ABCD'))
print (df)
     A    B    C    D
0  1.0  0.0  0.0  1.0
1  0.0  NaN  NaN  1.0
2  NaN  0.0  0.0  0.0
3  1.0  1.0  NaN  NaN
4  NaN  NaN  0.0  0.0
5  0.0  NaN  0.0  1.0
6  0.0  NaN  NaN  1.0
7  0.0  1.0  NaN  NaN
8  1.0  0.0  1.0  0.0
9  0.0  NaN  NaN  NaN

If need remove columns if both D and C has 0 and another columns has NaNs use DataFrame.all for test if both values are 0 and chain by & for bitwise AND with DataFrame.any for test if at least one value is NaN tested by DataFrame.isna:

m = df[['D','C']].eq(0).all(axis=1) & df.isna().any(axis=1)

df1 = df[~m]
print (df1)
     A    B    C    D
0  1.0  0.0  0.0  1.0
1  0.0  NaN  NaN  1.0
3  1.0  1.0  NaN  NaN
5  0.0  NaN  0.0  1.0
6  0.0  NaN  NaN  1.0
7  0.0  1.0  NaN  NaN
8  1.0  0.0  1.0  0.0
9  0.0  NaN  NaN  NaN

Another alternative without ~ for invert, but all conditions and also & is changed to | for bitwise OR:

m = df[['D','C']].ne(0).any(axis=1) | df.notna().all(axis=1)

df1 = df[m]
print (df1)
     A    B    C    D
0  1.0  0.0  0.0  1.0
1  0.0  NaN  NaN  1.0
3  1.0  1.0  NaN  NaN
5  0.0  NaN  0.0  1.0
6  0.0  NaN  NaN  1.0
7  0.0  1.0  NaN  NaN
8  1.0  0.0  1.0  0.0
9  0.0  NaN  NaN  NaN

Drop NaN containing rows in pandas DataFrame with column condition

Answers (1)

Related Questions