infiniteloop

Reputation: 2212

Calling pandas apply function on results of a mask

I want to apply a function to a subset of rows in my dataframe based on some condition described in a mask. E.g:

mask = (n.city=='No City Found')
n[mask].city = n[mask].address.apply(lambda x: find_city(x))

When I do this, pandas warns me that I'm trying to set a value on a copy of a slice from a DataFrame. When I inspect the DataFrame, I see that my changes have not persisted.

If I create a new DataFrame slice x using mask and apply the function to x, the results of the apply function are correctly stored in x.

x = n[mask]
x.city = x.address.apply(lambda x: find_city(x))

Is there a way to map this data back to my original DataFrame such that it only affects rows that meet the conditions described in my original mask?

Or is there an easier way altogether to perform such an operation?

Upvotes: 4

Views: 3472

Answers (1)

jcaliz

Reputation: 4021

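The right way to update values is with .loc, which selects the rows and the column in a single indexing step, so the assignment lands on the original DataFrame rather than on a temporary copy:

n.loc[mask, 'city'] = n.loc[mask, 'address'].apply(find_city)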
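
To make the .loc approach concrete, here is a minimal self-contained sketch. The find_city function and the sample data are hypothetical stand-ins (the question does not show them); here the city is assumed to be whatever follows the last comma in the address.

```python
import pandas as pd


def find_city(address: str) -> str:
    # Hypothetical stand-in for the asker's function:
    # treat the text after the last comma as the city.
    return address.rsplit(",", 1)[-1].strip()


# Made-up sample data, for illustration only.
n = pd.DataFrame(
    {
        "address": [
            "1 Main St, Springfield",
            "2 Oak Ave, Shelbyville",
            "3 Elm Rd, Ogdenville",
        ],
        "city": ["Springfield", "No City Found", "No City Found"],
    }
)

mask = n["city"] == "No City Found"

# Rows and column are selected in one .loc call, so the assignment
# modifies n itself; find_city only runs on the masked rows.
n.loc[mask, "city"] = n.loc[mask, "address"].apply(find_city)

print(n["city"].tolist())
```

The right-hand side keeps the index of the masked rows, so pandas aligns each result with the row it came from and leaves the other rows untouched.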

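You can also do it without the mask, in case you don't want to keep a separate mask variable around. This has to be a row-wise apply on the DataFrame (axis=1), because the function needs both the city and the address of each row. Note that it visits every row, so it is usually slower than the .loc version:

n['city'] = n.apply(
    lambda row: find_city(row.address)
    if row.city == 'No City Found' else row.city, axis=1
)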
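
As a rough sketch of the mask-free variant, the example below uses DataFrame.apply with axis=1, which passes each row to the function as a Series. As before, find_city and the sample data are hypothetical placeholders, not the asker's real function or data. The calls list is only there to show which rows actually reach find_city.

```python
import pandas as pd

calls = []  # records every address passed to find_city, for illustration


def find_city(address: str) -> str:
    # Hypothetical stand-in: the city is the text after the last comma.
    calls.append(address)
    return address.rsplit(",", 1)[-1].strip()


# Made-up sample data, for illustration only.
n = pd.DataFrame(
    {
        "address": ["1 Main St, Springfield", "2 Oak Ave, Shelbyville"],
        "city": ["Springfield", "No City Found"],
    }
)

# axis=1 hands each row to the lambda; rows that already have a city
# keep it, the others get the result of find_city.
n["city"] = n.apply(
    lambda row: find_city(row["address"])
    if row["city"] == "No City Found"
    else row["city"],
    axis=1,
)

print(n["city"].tolist())
print(calls)
```

The lambda still runs once per row in Python, even though find_city is only called for the placeholder rows, which is why the .loc form tends to be the better choice on large frames.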

Upvotes: 8
