Elham

Reputation: 867

Working with NaN values in multiple columns in Pandas

I have multiple datasets with different numbers of rows and the same number of columns. I would like to find NaN values in each column. For example, consider these two datasets:

dataset1 :            dataset2:
a  b                  a    b
1  10                 2    11
2  9                  3    12
3  8                  4    13
4  nan                nan  14
5  nan                nan  15
6  nan                nan  16

I want to find NaN values in columns a and b of both datasets. If a NaN occurs in column b, I want to remove every row that has one; if it occurs in column a, I want to fill those values with 0.

This is my code snippet:

a=pd.notnull(data['a'].values.any())
b= pd.notnull((data['b'].values.any()))
if a:
     data = data.dropna(subset=['a'])
if b:
     data[['a']] = data[['a']].fillna(value=0)

However, it does not work properly.

Upvotes: 3

Views: 2246

Answers (2)

BENY

Reputation: 323226

Pass your conditions to fillna as a dict:

df=df.fillna({'a':0,'b':np.nan}).dropna()

You do not need 'b' here, since filling NaN with NaN changes nothing:

df=df.fillna({'a':0}).dropna()

EDIT: applied to dataset2, this gives:

df.fillna({'a':0}).dropna()
Out[1319]: 
     a   b
0  2.0  11
1  3.0  12
2  4.0  13
3  0.0  14
4  0.0  15
5  0.0  16
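
For reference, here is a minimal self-contained sketch that rebuilds both example datasets from the question and applies the same chain to each. The helper name `clean` and the variable names are my own, chosen only for illustration. Because column a is filled first, the plain dropna() can only remove rows where b is NaN.

```python
import numpy as np
import pandas as pd

# The two example datasets from the question.
dataset1 = pd.DataFrame({"a": [1, 2, 3, 4, 5, 6],
                         "b": [10, 9, 8, np.nan, np.nan, np.nan]})
dataset2 = pd.DataFrame({"a": [2, 3, 4, np.nan, np.nan, np.nan],
                         "b": [11, 12, 13, 14, 15, 16]})

def clean(df):
    # Hypothetical helper: fill NaN in 'a' with 0, then drop rows
    # that still contain NaN (which can only be in 'b' at that point).
    return df.fillna({"a": 0}).dropna()

clean1 = clean(dataset1)  # keeps only the first three rows
clean2 = clean(dataset2)  # keeps all six rows, NaN in 'a' becomes 0
```

With more than two columns, dropna(subset=['b']) would be the safer choice, since a bare dropna() also removes rows with NaN in any other column.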

Upvotes: 2

Vaishali

Reputation: 38415

You just need fillna and dropna, without any control flow:

data = data.dropna(subset=['b']).fillna(0)
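
As far as I can tell, the flags in the question are always True, because `.values.any()` collapses the column to a single boolean before `pd.notnull` sees it, and the two branches also act on the wrong columns. The sketch below uses a made-up frame with NaN in both columns to show a per-column check (`isna().any()`) next to the one-line chain above; the frame and the name `has_nan` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with NaN in both columns (not from the question).
data = pd.DataFrame({"a": [1, np.nan, 3, np.nan],
                     "b": [10, 11, np.nan, np.nan]})

# Per-column check: True for each column holding at least one NaN.
has_nan = data.isna().any()

# No if-statements needed: both calls are no-ops when nothing is missing.
cleaned = data.dropna(subset=["b"]).fillna(0)
```

Here the rows with NaN in b are dropped first, so fillna(0) only touches the remaining NaN in column a.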

Upvotes: 4
