Reputation: 2024
I have Null and NaN values in one of my pandas DataFrame columns. I'd like to use apply with a condition that checks for NaN in that column, and store the function's return value in a new column.
import pandas as pd
from numpy import nan  # numpy.NaN was removed in NumPy 2.0; nan works in all versions

df = pd.DataFrame({'Col1': [1, 9, nan],
                   'Col2': [1, 3, 5]})

def sample_func(v1, v2, token):
    # call API
    r = cl_apicall(v1, v2, token)
    return r

# mock api call
def cl_apicall(v1, v2, token):
    return f"{v1},-{v2}-{token}"

# Apply function
df['new_col'] = df.apply(lambda x: sample_func(x['Col1'], x['Col2'], 'xxxxxx'), axis=1)
print(df)
Result
   Col1  Col2          new_col
0   1.0     1  1.0,-1.0-xxxxxx
1   9.0     3  9.0,-3.0-xxxxxx
2   NaN     5  nan,-5.0-xxxxxx
How do I write the apply statement so that it runs only for rows where Col1 is NaN or Null? Note: the function is simplified for reproducibility.
Expected result:
   Col1  Col2          new_col
0   1.0     1
1   9.0     3
2   NaN     5  nan,-5.0-xxxxxx
i.e. only apply the function to rows where Col1 is NaN.
Upvotes: 0
Views: 733
Reputation: 77347
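You can first filter for the rows you want, apply the function to that subset, and then assign the result to the new column. Pandas aligns the result on the index and fills the rows that were filtered out with NaN. This is usually more efficient than running apply on every row, because the function is only called for the rows that match.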
import pandas as pd
from numpy import nan  # numpy.NaN was removed in NumPy 2.0; nan works in all versions

df = pd.DataFrame({'Col1': [1, 9, nan],
                   'Col2': [1, 3, 5]})

def sample_func(v1, v2, token):
    # call API
    r = cl_apicall(v1, v2, token)
    return r

# mock api call
def cl_apicall(v1, v2, token):
    return f"{v1},-{v2}-{token}"

# Apply function
# Original version, which calls the function for every row:
#df['new_col'] = df.apply(lambda x: sample_func(x['Col1'], x['Col2'], 'xxxxxx'), axis=1)
# Filtered version, which calls the function only where Col1 is null:
df['new_col'] = df[df['Col1'].isnull()].apply(lambda x: sample_func(x['Col1'], x['Col2'], 'xxxxxx'), axis=1)
print(df)
Result
   Col1  Col2          new_col
0   1.0     1              NaN
1   9.0     3              NaN
2   NaN     5  nan,-5.0-xxxxxx
Upvotes: 2