Apply a function to pandas DataFrame with a condition to check for `NaNs`

Question

One of my pandas DataFrame columns contains null/NaN values. I'd like to apply a function only to the rows where that column is NaN and store the function's return value in a new column.

import pandas as pd
import numpy as np  # np.nan works on all NumPy versions; the NaN alias was removed in NumPy 2.0

df = pd.DataFrame({'Col1': [1, 9, np.nan],
                   'Col2': [1, 3, 5]})

def sample_func(v1, v2, token):    
    # call API 
    r = cl_apicall(v1, v2, token)
    return r

# mock api call
def cl_apicall(v1, v2, token):
    return f"{v1},-{v2}-{token}"

# Apply function
df['new_col'] = df.apply(lambda x: sample_func(x['Col1'], x['Col2'], 'xxxxxx'), axis = 1)

print(df)

Result

   Col1  Col2          new_col
0   1.0     1  1.0,-1.0-xxxxxx
1   9.0     3  9.0,-3.0-xxxxxx
2   NaN     5  nan,-5.0-xxxxxx

How do I write the apply statement so that the function runs only for rows where Col1 is NaN or null? Note that the function is simplified for reproducibility.

Expected result:

   Col1  Col2          new_col
0   1.0     1  
1   9.0     3  
2   NaN     5  nan,-5.0-xxxxxx

That is, apply the function only to rows where Col1 is NaN.

tdelaney · Accepted Answer

You can first filter for the rows you want, apply the function to that subset, and then assign the result to the new column. Because the assignment aligns on the index, pandas fills the rows that were filtered out with NaN. This is usually more efficient than running apply on every row.

import pandas as pd
import numpy as np  # np.nan works on all NumPy versions; the NaN alias was removed in NumPy 2.0

df = pd.DataFrame({'Col1': [1, 9, np.nan],
                   'Col2': [1, 3, 5]})

def sample_func(v1, v2, token):
    # call API 
    r = cl_apicall(v1, v2, token)
    return r

# mock api call
def cl_apicall(v1, v2, token):
    return f"{v1},-{v2}-{token}"

# Apply function
#df['new_col'] = df.apply(lambda x: sample_func(x['Col1'], x['Col2'], 'xxxxxx'), axis = 1)
df['new_col'] = df[df['Col1'].isnull()].apply(lambda x: sample_func(x['Col1'], x['Col2'], 'xxxxxx'), axis = 1)
print(df)

Result

   Col1  Col2          new_col
0   1.0     1              NaN
1   9.0     3              NaN
2   NaN     5  nan,-5.0-xxxxxx
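
If you want blank strings instead of NaN in the rows that were skipped (as in the question's expected result), one possible variation is to build the mask separately and reindex the result before assigning. This is only a sketch: the names `mask` and `result` are my own, and `cl_apicall` is the same mock as above, not a real API.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Col1": [1, 9, np.nan],
                   "Col2": [1, 3, 5]})

# Mock API call, same as in the question.
def cl_apicall(v1, v2, token):
    return f"{v1},-{v2}-{token}"

# Boolean mask: True only where Col1 is missing.
mask = df["Col1"].isna()

# Run the function on the masked rows only; the result keeps their index labels.
result = df.loc[mask].apply(
    lambda row: cl_apicall(row["Col1"], row["Col2"], "xxxxxx"), axis=1
)

# Expand back to the full index, using "" for rows that were skipped.
df["new_col"] = result.reindex(df.index, fill_value="")
print(df)
```

The result is computed before `new_col` is added so that the rows passed to the function contain only the numeric columns, which keeps the formatted values the same as in the output above.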
