Operate on columns based on other column contents in pandas

Question

Coming from R, I cannot figure out how to perform vectorized operations on one DataFrame column using the values of other columns, e.g.:

import pandas as pd
df = pd.DataFrame({'s': ['Big bear eats cat', 'cute cat sleeps'], 'a': ['bear', 'cat']})

Now I just want to replace, row by row, the occurrence of a in s with ANIMAL (other operations could be split), so that s looks like this:

0    Big ANIMAL eats cat
1    cute ANIMAL sleeps

In R's data.table (with vectorized functions) I would just write something like:

df[,s:=str_replace(s,a,"ANIMAL")]

I saw that I might be able to use apply, but that still seems very complex for such a simple case.

katsumi · Accepted Answer

I found the following solution, which does what I am used to from R by vectorizing str.replace (NumPy required):

import numpy as np

# np.vectorize calls str.replace(s, a, "ANIMAL") once per row, pairing s and a element-wise
df['s'] = np.vectorize(str.replace)(df['s'], df['a'], "ANIMAL")

print(df)
      a                    s
0  bear  Big ANIMAL eats cat
1   cat   cute ANIMAL sleeps
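
Note that np.vectorize is essentially a convenience wrapper around a Python loop, so it should not be expected to run faster than looping yourself. As a possible alternative without NumPy, here is a minimal sketch of the same row-wise replacement using a list comprehension and, for comparison, DataFrame.apply with axis=1. The names replaced and replaced_apply are only illustrative and are not part of the original solution:

```python
import pandas as pd

df = pd.DataFrame({'s': ['Big bear eats cat', 'cute cat sleeps'],
                   'a': ['bear', 'cat']})

# Pair each string in s with the matching value in a and replace row by row.
replaced = [s.replace(a, 'ANIMAL') for s, a in zip(df['s'], df['a'])]

# The same operation via apply along rows (axis=1); returns a Series.
replaced_apply = df.apply(lambda row: row['s'].replace(row['a'], 'ANIMAL'), axis=1)

df['s'] = replaced
print(df['s'].tolist())
# ['Big ANIMAL eats cat', 'cute ANIMAL sleeps']
```

Both variants give the same result as the np.vectorize call; the list comprehension is usually the simpler one to read.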
