Nordle

Reputation: 2981

How to replace values in a column in pandas using regex and a conditional

I'm trying to replace certain values in a pandas DataFrame column using regex, but I want to apply the regex only to rows selected by the values in another column.

A basic example:

index  col1  col2
1      yes   foobar
2      yes   foo
3      no    foobar

Using the following:

df.loc[df['col1'] == 'yes', 'col2'].replace({r'(fo)o(?!bar)' :r'\1'}, inplace=True, regex=True)

I expected the following result:

index  col1  col2
1      yes   foobar
2      yes   fo
3      no    foobar

However, it doesn't seem to work. It doesn't throw any errors or a SettingWithCopyWarning; it just does nothing. Is there an alternative way to do this?

Upvotes: 3

Views: 3351

Answers (2)

user3483203

Reputation: 51185

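Using np.where (pass regex=True to str.replace, since pandas 2.0 and later treat the pattern as a literal string by default):

df.assign(
    col2=np.where(df.col1.eq('yes'), df.col2.str.replace(r'(fo)o(?!bar)', r'\1', regex=True), df.col2)
)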

  col1    col2
1  yes  foobar
2  yes      fo
3   no  foobar
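
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the np.where approach. The sample frame is rebuilt from the question's table; the index values 1 to 3 and the variable names `replaced` and `result` are my own assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (index 1..3 is an assumption).
df = pd.DataFrame(
    {"col1": ["yes", "yes", "no"], "col2": ["foobar", "foo", "foobar"]},
    index=[1, 2, 3],
)

# Apply the regex to every row first; regex=True is required on pandas >= 2.0.
replaced = df["col2"].str.replace(r"(fo)o(?!bar)", r"\1", regex=True)

# Keep the replaced value only where col1 == 'yes', otherwise keep the original.
result = df.assign(col2=np.where(df["col1"].eq("yes"), replaced, df["col2"]))
print(result)
```

Note that df.assign returns a new DataFrame and leaves df untouched, so assign the result to a name if you want to keep it.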

Upvotes: 1

jezrael

Reputation: 863791

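To avoid chained assignment, assign the result back and remove inplace=True. In the original code, df.loc[...] with a boolean mask returns a copy, so the in-place replace modifies that temporary object and never reaches df: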

mask = df['col1'] == 'yes'
df.loc[mask, 'col2'] = df.loc[mask, 'col2'].replace({r'(fo)o(?!bar)' :r'\1'}, regex=True)

print(df)
  col1    col2
1  yes  foobar
2  yes      fo
3   no  foobar
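
A runnable version of the above, as a sketch: the sample frame is rebuilt from the question's table (the index values 1 to 3 are an assumption), and the `pattern` variable is just a name I introduced for readability.

```python
import pandas as pd

# Sample data rebuilt from the question (index 1..3 is an assumption).
df = pd.DataFrame(
    {"col1": ["yes", "yes", "no"], "col2": ["foobar", "foo", "foobar"]},
    index=[1, 2, 3],
)

# Drop the trailing 'o' of 'foo' unless it is followed by 'bar'.
pattern = r"(fo)o(?!bar)"

# Select the rows once, run the regex replace on that selection,
# then write the result back through .loc so df itself is updated.
mask = df["col1"] == "yes"
df.loc[mask, "col2"] = df.loc[mask, "col2"].replace({pattern: r"\1"}, regex=True)
print(df)
```

Rows where col1 is 'no' are never passed to replace, so they keep their values even if they would have matched the pattern.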

Upvotes: 4
