How to use apply attribute on tuple

Question

I am trying to change all NaN values in column b to 1 when column a in the same row is not NaN. For example, if a == 1 and b is NaN, b should become 1. Here is my code:

raw_data['b'] = ((raw_data['a'],raw_data['b']).apply(condition))

def condition(a,b):
    if a != None and b == None:
        return 1

I got AttributeError: 'tuple' object has no attribute 'apply'. What other methods can I use in this situation?

jezrael · Accepted Answer

First create a boolean mask by chaining the conditions with &, using the functions notnull and isnull.

Then there are several possible ways to set the value 1: with mask, loc, or numpy.where:

mask = raw_data['a'].notnull() & raw_data['b'].isnull()

raw_data['b'] = raw_data['b'].mask(mask, 1)

Or:

raw_data.loc[mask, 'b'] = 1

Or:

raw_data['b'] = np.where(mask, 1, raw_data['b'])
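
To check that the three alternatives above give the same result, here is a minimal sketch that runs each one on the same small frame. The names via_mask, via_loc, via_where and copy_df are only illustrative and are not part of the original code:

```python
import numpy as np
import pandas as pd

raw_data = pd.DataFrame({
    'a': [1, np.nan, np.nan],
    'b': [np.nan, np.nan, 2],
})

# True only where a is present and b is missing
mask = raw_data['a'].notnull() & raw_data['b'].isnull()

# 1) Series.mask replaces values where the mask is True
via_mask = raw_data['b'].mask(mask, 1)

# 2) .loc assigns in place, so work on a copy to keep raw_data untouched
copy_df = raw_data.copy()
copy_df.loc[mask, 'b'] = 1
via_loc = copy_df['b']

# 3) numpy.where returns a plain ndarray, so wrap it back into a Series
via_where = pd.Series(np.where(mask, 1, raw_data['b']),
                      index=raw_data.index, name='b')

# Series.equals treats NaN in the same position as equal
assert via_mask.equals(via_loc)
assert via_mask.equals(via_where)
```

Note that mask and numpy.where build a new object that has to be assigned back, while loc modifies the DataFrame directly.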

Sample:

import numpy as np
import pandas as pd

raw_data = pd.DataFrame({
    'a': [1, np.nan, np.nan],
    'b': [np.nan, np.nan, 2]
})
print (raw_data)
     a    b
0  1.0  NaN
1  NaN  NaN
2  NaN  2.0

mask = raw_data['a'].notnull() & raw_data['b'].isnull()

print (mask)
0     True
1    False
2    False
dtype: bool

raw_data.loc[mask, 'b'] = 1

print (raw_data)
     a    b
0  1.0  1.0
1  NaN  NaN
2  NaN  2.0

EDIT:

If you want to use a custom function (really slow on larger data), you need DataFrame.apply with axis=1 to process the data row by row:

def condition(x):
    if pd.notnull(x.a) and pd.isnull(x.b):
        return 1
    else:
        return x.b

raw_data['b'] = raw_data.apply(condition, axis=1)

print (raw_data)
     a    b
0  1.0  1.0
1  NaN  NaN
2  NaN  2.0
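
The function above uses pd.notnull and pd.isnull rather than comparisons with None, as in the question. A short sketch of why this matters, assuming the missing values are stored as NaN (the variable names here are illustrative only):

```python
import numpy as np
import pandas as pd

value = np.nan

# NaN is a float, not None, so comparisons with None cannot detect it
eq_none = (value == None)    # False: the check "b == None" never matches NaN
ne_none = (value != None)    # True: the check "a != None" passes even for NaN

# pd.isnull recognises both NaN and None as missing
nan_missing = pd.isnull(value)
none_missing = pd.isnull(None)

print(eq_none, ne_none, nan_missing, none_missing)
```

So even with apply called correctly, the original condition would never find a NaN in column b.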
