Reputation: 683
I have a dataframe with Columns A,B,D and C. I would like to drop all NaN containing rows in the dataframe only where D and C columns contain value 0.
Would anyone be able to help me in this issue.
Thanks & Best Regards
Michael
Upvotes: 1
Views: 47
Reputation: 862396
Use boolean indexing
with inverted mask by ~
:
np.random.seed(2021)
df = pd.DataFrame(np.random.choice([1,0,np.nan], size=(10, 4)), columns=list('ABCD'))
print (df)
A B C D
0 1.0 0.0 0.0 1.0
1 0.0 NaN NaN 1.0
2 NaN 0.0 0.0 0.0
3 1.0 1.0 NaN NaN
4 NaN NaN 0.0 0.0
5 0.0 NaN 0.0 1.0
6 0.0 NaN NaN 1.0
7 0.0 1.0 NaN NaN
8 1.0 0.0 1.0 0.0
9 0.0 NaN NaN NaN
If need remove columns if both D
and C
has 0
and another columns has NaN
s use DataFrame.all
for test if both values are 0
and chain by &
for bitwise AND
with
DataFrame.any
for test if at least one value is NaN
tested by DataFrame.isna
:
m = df[['D','C']].eq(0).all(axis=1) & df.isna().any(axis=1)
df1 = df[~m]
print (df1)
A B C D
0 1.0 0.0 0.0 1.0
1 0.0 NaN NaN 1.0
3 1.0 1.0 NaN NaN
5 0.0 NaN 0.0 1.0
6 0.0 NaN NaN 1.0
7 0.0 1.0 NaN NaN
8 1.0 0.0 1.0 0.0
9 0.0 NaN NaN NaN
Another alternative without ~
for invert, but all conditions and also &
is changed to |
for bitwise OR
:
m = df[['D','C']].ne(0).any(axis=1) | df.notna().all(axis=1)
df1 = df[m]
print (df1)
A B C D
0 1.0 0.0 0.0 1.0
1 0.0 NaN NaN 1.0
3 1.0 1.0 NaN NaN
5 0.0 NaN 0.0 1.0
6 0.0 NaN NaN 1.0
7 0.0 1.0 NaN NaN
8 1.0 0.0 1.0 0.0
9 0.0 NaN NaN NaN
Upvotes: 1