user4718221

Reputation: 606

Pandas: using apply() to run a function on only part of a dataframe

I am using addresses stored in the columns of a pandas dataframe as arguments to a function that calls the Google Maps API, and storing the results in a column called address_components in the same dataframe:

dm.loc[: , 'address_components'] = dm.loc[:, ['streetNumber', 'streetName', 'city']].apply(
    lambda row: get_address(row[0], row[1], row[2]), axis=1)

The entire dataframe is very large and I would like to run the same function on part of the dataframe that fits a specific condition. I have tried this:

dm[dm['g_FSA'] == 'None'].loc[: , 'address_components'] = dm[dm['g_FSA'] == 'None'].loc[:, ['streetNumber', 'streetName', 'city']].apply(
    lambda row: get_address(row[0], row[1], row[2]), axis=1)

But that's not working properly. Could someone help me spot my mistake?

Upvotes: 4

Views: 791

Answers (1)

Shubham Sharma

Reputation: 71689

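Your attempt assigns through dm[dm['g_FSA'] == 'None'].loc[...], which is chained indexing: dm[...] returns a temporary copy, so the assignment never reaches dm itself. Instead, create a boolean mask using Series.eq, pass that mask to a single DataFrame.loc call to select the rows and columns, and use DataFrame.apply to run the custom function on that selection only: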

m = dm['g_FSA'].eq('None')
dm.loc[m, 'address_components'] = (
    dm.loc[m, ['streetNumber', 'streetName', 'city']]
    .apply(lambda s: get_address(*s), axis=1)
)
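
To check the behaviour without hitting the API, here is a minimal sketch on a toy frame. The get_address below is a hypothetical stand-in that just formats a string (the real one would call Google Maps), and the sample rows are made up for illustration:

```python
import pandas as pd


def get_address(street_number, street_name, city):
    # Hypothetical stand-in for the real Google Maps lookup
    return f"{street_number} {street_name}, {city}"


# Made-up sample data; only rows 0 and 2 have g_FSA == 'None'
dm = pd.DataFrame({
    "streetNumber": ["10", "22", "5"],
    "streetName": ["Main St", "King St", "Queen St"],
    "city": ["Toronto", "Ottawa", "Toronto"],
    "g_FSA": ["None", "K1A", "None"],
    "address_components": [None, "already filled", None],
})

# Boolean mask of the rows that still need a lookup
m = dm["g_FSA"].eq("None")

# One .loc call on the left-hand side, so the assignment writes into dm itself
dm.loc[m, "address_components"] = (
    dm.loc[m, ["streetNumber", "streetName", "city"]]
    .apply(lambda s: get_address(*s), axis=1)
)

print(dm[["g_FSA", "address_components"]])
```

Rows where the mask is False keep whatever was already in address_components, because the result of apply is aligned back to dm on the index.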

Upvotes: 2
