Reputation: 45
I am trying to change all NaN elements in column b to 1 if column a is not NaN in the same row. For example, if a == 1 and b is NaN, change b to 1. Here is my code:
raw_data['b'] = ((raw_data['a'],raw_data['b']).apply(condition))
def condition(a,b):
    if a != None and b == None:
        return 1
And I got an AttributeError: 'tuple' object has no attribute 'apply'. What other methods can I use in this situation?
Upvotes: 2
Views: 445
Reputation: 862701
First create a boolean mask by chaining conditions with &, using the functions isnull and notnull.
Then there are several possible solutions for setting 1: mask, loc, or numpy.where:
mask = raw_data['a'].notnull() & raw_data['b'].isnull()
raw_data['b'] = raw_data['b'].mask(mask, 1)
Or:
raw_data.loc[mask, 'b'] = 1
Or:
raw_data['b'] = np.where(mask, 1, raw_data['b'])
Sample:
raw_data = pd.DataFrame({
    'a': [1, np.nan, np.nan],
    'b': [np.nan, np.nan, 2]
})
print (raw_data)
     a    b
0  1.0  NaN
1  NaN  NaN
2  NaN  2.0
mask = raw_data['a'].notnull() & raw_data['b'].isnull()
print (mask)
0     True
1    False
2    False
dtype: bool
raw_data.loc[mask, 'b'] = 1
print (raw_data)
     a    b
0  1.0  1.0
1  NaN  NaN
2  NaN  2.0
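As a quick check that the three variants are interchangeable, here is a minimal self-contained sketch on the same sample data. The names via_mask, via_loc, via_where and tmp are only illustrative and are not part of the original code.

```python
import numpy as np
import pandas as pd

# Same sample frame as in the answer.
raw_data = pd.DataFrame({
    'a': [1, np.nan, np.nan],
    'b': [np.nan, np.nan, 2],
})

# True only where a is present and b is missing.
mask = raw_data['a'].notnull() & raw_data['b'].isnull()

# Variant 1: Series.mask replaces values where the mask is True.
via_mask = raw_data['b'].mask(mask, 1)

# Variant 2: loc assignment, done on a copy so raw_data stays unchanged here.
tmp = raw_data.copy()
tmp.loc[mask, 'b'] = 1
via_loc = tmp['b']

# Variant 3: numpy.where returns an ndarray, so wrap it back into a Series.
via_where = pd.Series(np.where(mask, 1, raw_data['b']),
                      index=raw_data.index, name='b')

# Series.equals treats NaN in the same position as equal.
print(via_mask.equals(via_loc), via_mask.equals(via_where))
```

Note that mask and numpy.where build a new column, while loc modifies the DataFrame in place, so which one fits best probably depends on whether you want to keep the original values around.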
EDIT:
If you want to use a custom function (really slow for larger data), you need apply with axis=1 to process the DataFrame by rows:
def condition(x):
    if pd.notnull(x.a) and pd.isnull(x.b):
        return 1
    else:
        return x.b
raw_data['b'] = raw_data.apply(condition, axis=1)
print (raw_data)
     a    b
0  1.0  1.0
1  NaN  NaN
2  NaN  2.0
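To get a rough feel for the speed difference, here is a sketch that runs both approaches on synthetic data. The frame big, its size n, the random seed and the helpers with_apply and with_mask are all assumptions made for illustration. Timings depend on the machine and the pandas version, so no numbers are claimed.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic data (an assumption for illustration): roughly half of each column is NaN.
rng = np.random.default_rng(0)
n = 10_000
big = pd.DataFrame({
    'a': np.where(rng.random(n) < 0.5, np.nan, 1.0),
    'b': np.where(rng.random(n) < 0.5, np.nan, 2.0),
})

def condition(x):
    # Row-wise rule from the answer: use 1 when a is present and b is missing.
    if pd.notnull(x.a) and pd.isnull(x.b):
        return 1
    else:
        return x.b

def with_apply(df):
    # Calls condition once per row.
    return df.apply(condition, axis=1)

def with_mask(df):
    # Vectorized: one boolean mask over the whole column.
    m = df['a'].notnull() & df['b'].isnull()
    return df['b'].mask(m, 1)

# Both approaches should produce the same values (NaN compared as equal).
same = np.allclose(with_apply(big).astype(float), with_mask(big), equal_nan=True)
print('same result:', same)

# Single-run timings in seconds.
print('apply:', timeit.timeit(lambda: with_apply(big), number=1))
print('mask: ', timeit.timeit(lambda: with_mask(big), number=1))
```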
Upvotes: 3