Reputation: 2981
I'm trying to replace certain values in a column of a pandas DataFrame using a regex, but I only want to apply the regex to rows selected by the values in another column.
A basic example:
index col1 col2
1 yes foobar
2 yes foo
3 no foobar
Using the following:
df.loc[df['col1'] == 'yes', 'col2'].replace({r'(fo)o(?!bar)' :r'\1'}, inplace=True, regex=True)
I expected the following result:
index col1 col2
1 yes foobar
2 yes fo
3 no foobar
However, it doesn't seem to work. It doesn't throw any errors or a SettingWithCopyWarning;
it just does nothing. Is there an alternative way to do this?
Upvotes: 3
Views: 3351
Reputation: 51185
Using np.where:
import numpy as np

df.assign(
    col2=np.where(df.col1.eq('yes'), df.col2.str.replace(r'(fo)o(?!bar)', r'\1', regex=True), df.col2)
)
col1 col2
1 yes foobar
2 yes fo
3 no foobar
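For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same np.where idea. The fourth row (`no` / `foo`) is not in the question; it is an assumption added only to show that rows where col1 is not 'yes' are left alone even when the pattern would match. `regex=True` is passed explicitly because recent pandas versions treat the pattern as a literal string by default.

```python
import numpy as np
import pandas as pd

# Sample data from the question, plus one hypothetical extra row (index 4)
# to show that non-matching rows of col1 are not touched.
df = pd.DataFrame(
    {"col1": ["yes", "yes", "no", "no"],
     "col2": ["foobar", "foo", "foobar", "foo"]},
    index=[1, 2, 3, 4],
)

pattern = r"(fo)o(?!bar)"

# Apply the regex to the whole column first; regex=True is explicit
# because newer pandas defaults to literal matching in str.replace.
replaced = df["col2"].str.replace(pattern, r"\1", regex=True)

# Keep the replaced value only where col1 == 'yes', else the original.
# assign returns a new DataFrame; df itself is left unchanged.
result = df.assign(col2=np.where(df["col1"].eq("yes"), replaced, df["col2"]))
print(result)
```

Note that `assign` does not modify `df`, so the result has to be bound to a name (or assigned back to `df`) to be kept.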
Upvotes: 1
Reputation: 863791
To avoid chained assignment, assign the result back and remove inplace=True:
mask = df['col1'] == 'yes'
df.loc[mask, 'col2'] = df.loc[mask, 'col2'].replace({r'(fo)o(?!bar)' :r'\1'}, regex=True)
print(df)
col1 col2
1 yes foobar
2 yes fo
3 no foobar
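As a rough illustration of why the original attempt appears to do nothing, the sketch below assumes the question's sample data. Selecting with a boolean mask through `.loc` returns a new object, so an in-place replace only changes that copy; the variable name `subset` is introduced here purely for illustration.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame(
    {"col1": ["yes", "yes", "no"], "col2": ["foobar", "foo", "foobar"]},
    index=[1, 2, 3],
)
mask = df["col1"] == "yes"
mapping = {r"(fo)o(?!bar)": r"\1"}

# Boolean-mask selection yields a copy, so replacing in place
# changes `subset` but not the column inside df.
subset = df.loc[mask, "col2"]
subset.replace(mapping, regex=True, inplace=True)
after_inplace = df["col2"].tolist()   # df is still the original data

# Assigning the result back through .loc writes into df itself.
df.loc[mask, "col2"] = df.loc[mask, "col2"].replace(mapping, regex=True)
after_assign = df["col2"].tolist()

print(after_inplace)
print(after_assign)
```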
Upvotes: 4