Erin

Reputation: 505

Replacing values in a DataFrame column based on a condition with np.select and np.where

I have a list, data_terms, and I'd like to go through a column of my DataFrame and replace every value that matches any item in data_terms with the word "database". I've tried np.select:

discrete_distributed_bar["profile_standardized"] = np.select(
    [discrete_distributed_bar["profile_standardized"].isin(data_terms)],
    ["database"],
    default=discrete_distributed_bar.profile_standardized,
)

and, with np.where:

for index, row in discrete_distributed_bar["profile_standardized"].items():
    np.where(
        discrete_distributed_bar["profile_standardized"].isin(data_terms),
        "database",
        row,
    )

but the replacement is not actually happening. What am I missing here?

Thanks for the help!

Upvotes: 1

Views: 213

Answers (1)

jezrael

Reputation: 862591

The solution can be simplified here by using Series.replace:

discrete_distributed_bar["profile_standardized"] = discrete_distributed_bar["profile_standardized"].replace(data_terms, 'database')

But I suspect there is a problem with the data itself (e.g. stray whitespace). Check whether any values actually match:

print(discrete_distributed_bar["profile_standardized"].isin(data_terms))

If values fail to match because of surrounding whitespace, strip it before replacing:

discrete_distributed_bar["profile_standardized"] = discrete_distributed_bar["profile_standardized"].str.strip().replace(data_terms, 'database')
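To make the whitespace theory concrete, here is a minimal sketch on made-up data (the DataFrame `df`, its values, and the contents of `data_terms` are assumptions for illustration, not the asker's real data). It shows the exact-match check missing padded values, the strip-then-replace fix, and a loop-free np.where variant. Note that np.where returns a new array rather than modifying anything in place, so its result has to be assigned, which the loop in the question never does.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the second and fourth values carry stray whitespace.
df = pd.DataFrame({"profile_standardized": ["sql", " oracle", "python", "sql "]})
data_terms = ["sql", "oracle"]  # assumed terms, for illustration only

col = df["profile_standardized"]

# Diagnostic: count exact matches before any cleaning.
n_raw = col.isin(data_terms).sum()

# Strip surrounding whitespace, then replace every matching term.
replaced = col.str.strip().replace(data_terms, "database")

# Vectorized np.where alternative: no loop, and the result is kept.
stripped = col.str.strip()
via_where = np.where(stripped.isin(data_terms), "database", stripped)

print(n_raw)
print(replaced.tolist())
print(list(via_where))
```

If the exact-match count is lower than expected on the real data, padding or inconsistent casing in the column is a likely cause, and cleaning the column before matching should make either approach work.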

Upvotes: 1
