Reputation: 97
I have the DataFrame below. My requirement is: if both columns are np.nan in a row,
leave that row unchanged; if only one of the columns is empty, fill the missing value
with 0. I wrote this code, but why is it not working? Please suggest.
import pandas as pd
import numpy as np
data = {'Age': [np.nan, np.nan, 22, np.nan, 50, 99],
        'Salary': [217, np.nan, 262, 352, 570, np.nan]}
df = pd.DataFrame(data)
print(df)
cond1 = (df['Age'].isnull()) & (df['Salary'].isnull())
if cond1 is False:
    df['Age'] = df['Age'].fillna(0)
    df['Salary'] = df['Salary'].fillna(0)
print(df)
Upvotes: 1
Views: 2524
Reputation: 28729
Get the rows that are all nulls and use where
to exclude them during the fill:
bools = df.isna().all(axis = 1)
df.where(bools, df.fillna(0))
    Age  Salary
0   0.0   217.0
1   NaN     NaN
2  22.0   262.0
3   0.0   352.0
4  50.0   570.0
5  99.0     0.0
Your if statement won't work because the check has to happen for each row. cond1 is a Series that can hold a mix of True and False values, so it cannot be meaningfully compared to a single False.
The expression cond1 is False is an identity test between a Series object and the False singleton, so it always evaluates to False and the fill inside the if never runs.
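To make the failure concrete, here is a minimal sketch that rebuilds the question's df and inspects cond1. The names identity_check and truth_value_error are introduced here only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Age': [np.nan, np.nan, 22, np.nan, 50, 99],
    'Salary': [217, np.nan, 262, 352, 570, np.nan],
})

cond1 = df['Age'].isnull() & df['Salary'].isnull()

# cond1 is a boolean Series with one entry per row, not a single bool.
print(cond1.tolist())

# `is` tests object identity: a Series object is never the singleton False,
# so this expression is always False and the body of the `if` never runs.
identity_check = cond1 is False
print(identity_check)

# Asking a multi-element Series for a single truth value raises ValueError.
try:
    bool(cond1)
    truth_value_error = False
except ValueError:
    truth_value_error = True
print(truth_value_error)
```

Only row 1 has both values missing, so cond1 is True there and False everywhere else; no single True or False describes the whole column.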
An inefficient way would be to traverse the rows:
for row, index in zip(cond1, df.index):
    if not row:
        df.loc[index] = df.loc[index].fillna(0)
Apart from the inefficiency, you have to keep track of the index labels yourself; the pandas options abstract that away and are quite efficient, since the looping happens in C.
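As a rough sketch of the same row-wise logic without an explicit Python loop, the rows to fill can be selected with a boolean mask and assigned back through .loc. This assumes the question's df; rows_to_fill is a name made up for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Age': [np.nan, np.nan, 22, np.nan, 50, 99],
    'Salary': [217, np.nan, 262, 352, 570, np.nan],
})

# True for rows that have at least one non-null value (not all-NaN rows).
rows_to_fill = ~df.isna().all(axis=1)

# Fill NaN with 0 only in those rows; all-NaN rows are left untouched.
df.loc[rows_to_fill] = df.loc[rows_to_fill].fillna(0)
print(df)
```

The mask does the job of the per-row if, and pandas aligns the filled rows back by index label.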
Upvotes: 0
Reputation: 1137
This might be a bit less sophisticated than the other answers, but it worked for me:
tmp = df.loc[df['Age'].isna() & df['Salary'].isna()]
df.fillna(0, inplace=True)
df.loc[tmp.index] = np.nan
Upvotes: 0
Reputation: 323376
You can just assign it with update
c = ['Age','Salary']
df.update(df.loc[~df[c].isna().all(axis=1), c].fillna(0))
df
Out[341]:
    Age  Salary
0   0.0   217.0
1   NaN     NaN
2  22.0   262.0
3   0.0   352.0
4  50.0   570.0
5  99.0     0.0
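Some context on why update fits here: DataFrame.update modifies the frame in place, aligns on index and columns, and overwrites only with non-NA values from its argument. Rows missing from the argument are left alone. A small sketch with made-up frames (left and patch are illustrative names, not part of the answer):

```python
import numpy as np
import pandas as pd

left = pd.DataFrame({'Age': [np.nan, np.nan, 22.0]})

# patch covers only rows 0 and 2; its NaN at row 2 must not overwrite 22.0.
patch = pd.DataFrame({'Age': [0.0, np.nan]}, index=[0, 2])

# update works in place and returns None.
left.update(patch)
print(left)
```

Row 0 takes the 0.0 from patch, row 1 stays NaN because patch has no such row, and row 2 keeps 22.0 because the NaN in patch is ignored. That is why passing only the filled, not-all-null rows leaves the all-null rows untouched.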
Upvotes: 3
Reputation: 79338
c1 = df['Age'].isna()
c2 = df['Salary'].isna()
df[np.c_[c1 & ~c2, ~c1 & c2]]=0
df
    Age  Salary
0   0.0   217.0
1   NaN     NaN
2  22.0   262.0
3   0.0   352.0
4  50.0   570.0
5  99.0     0.0
Upvotes: 2