progster

Reputation: 937

None in if condition, how to handle missing data?

If the value of age is missing, I want to create a variable with the value 1. Instead, every entry in the Value column comes out as None.

import numpy as np
import pandas as pd

raw_data1 = {'id': [1, 2, 3, 5],
             'age': [0, np.nan, 10, 2]}
df1 = pd.DataFrame(raw_data1, columns=['id', 'age'])


def my_test(b):
    if b is None:
        return 1


df1['Value'] = df1.apply(lambda row: my_test(row['age']), axis=1)  

How can I implement it? I know that there are several ways, but I would like to focus on using a function (def my_test, etc.).

Upvotes: 2

Views: 3018

Answers (4)

Sreeram TP

Reputation: 11907

You can use map for this:

df1['Value'] = df1['age'].map(lambda x : 1 if np.isnan(x) else np.nan)

If you want to make use of your function, you can pass it to map like this:

def my_test(b):
    if np.isnan(b):
        return 1
    else:
        return np.nan

df1['Value'] = df1['age'].map(my_test)
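For context on why the original check never fires: a missing value in a numeric pandas column is stored as NaN, which is a float and not the None object, so `b is None` is always False and the function falls through, returning None for every row. Below is a minimal sketch of a function-based version using pd.isna, which is True for both NaN and None. Returning 0 for non-missing ages is an assumption made here for illustration; the question does not say what those rows should hold.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id': [1, 2, 3, 5], 'age': [0, np.nan, 10, 2]})

# NaN is a float, not None, so an identity check against None fails.
nan_is_none = np.nan is None  # False


def my_test(b):
    # pd.isna handles NaN as well as None.
    if pd.isna(b):
        return 1
    # Assumption: non-missing ages map to 0 (not specified in the question).
    return 0


df1['Value'] = df1['age'].map(my_test)
print(df1)
```

The same function also works with the row-wise form from the question, df1.apply(lambda row: my_test(row['age']), axis=1), although mapping over the single column avoids building a Series for each row.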

Upvotes: 0

Raunaq Jain

Reputation: 917

Do this instead:

>>> df1['value'] = df1['age'].isna().astype(int)
>>> df1
    id   age  value
 0   1   0.0      0
 1   2   NaN      1
 2   3  10.0      0
 3   5   2.0      0

Upvotes: 0

Massimo Costa

Reputation: 1860

You can use row.get('age') instead of row['age'].

get() returns None if 'age' is not a key in the row.
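A small sketch of how get() behaves on a row, to show which case it covers. The 'height' label is hypothetical and only stands in for a key that is absent. When the key exists but its value is missing, get() hands back NaN rather than None, so a pd.isna check is still needed for the data in the question.

```python
import numpy as np
import pandas as pd

# One row shaped like those produced by df1.apply(..., axis=1).
row = pd.Series({'id': 2, 'age': np.nan})

# 'height' is a hypothetical label that is not in the row.
absent_key = row.get('height')   # default is returned: None
present_nan = row.get('age')     # key exists, value is NaN

print(absent_key is None)        # key not found
print(present_nan is None)       # NaN is not None
print(pd.isna(present_nan))      # NaN is detected by pd.isna
```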

Upvotes: 0

Joe

Reputation: 12417

If I understood you correctly, you can use:

df1['value'] = np.where(df1['age'].isnull(), 1, '')

Output:

   id   age value
0   1   0.0      
1   2   NaN     1
2   3  10.0      
3   5   2.0      
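If a numeric column is preferred over the blank entries shown above, the same np.where call can take a number as its third argument. This is only a sketch; using 0 for non-missing ages is an assumption, not something the question specifies.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id': [1, 2, 3, 5], 'age': [0, np.nan, 10, 2]})

# 1 where age is missing, 0 otherwise (0 is an assumed placeholder).
df1['value'] = np.where(df1['age'].isnull(), 1, 0)
print(df1)
```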

Upvotes: 3
