Reputation: 154
Coming from R, I cannot figure out how to do a vectorized operation on one dataframe column that uses the values of another column, e.g.:
import pandas as pd
df = pd.DataFrame({'s':['Big bear eats cat','cute cat sleeps'],'a':['bear','cat']})
Now I just want to replace, row-wise, the occurrence of a in s with ANIMAL (other operations could be split), so that it looks like this:
0 Big ANIMAL eats cat
1 cute ANIMAL sleeps
In R data.table (with vectorized functions) I would just write something like
df[,s:=str_replace(s,a,"ANIMAL")]
I saw that I might be able to use apply, but that still seemed very complex for such a simple case.
Upvotes: 2
Views: 37
Reputation: 154
I found the following solution, which does what I am used to in R by vectorizing str.replace (NumPy needed):
import numpy as np
df['s'] = np.vectorize(str.replace)(df['s'], df['a'], "ANIMAL")
print(df)
      a                    s
0  bear  Big ANIMAL eats cat
1   cat   cute ANIMAL sleeps
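Note that np.vectorize is, according to the NumPy documentation, mainly a convenience wrapper around a Python loop, so it should not be expected to run like compiled vectorized code. As a rough sketch, assuming the same df as in the question, the same result can be written with a plain comprehension or with a row-wise apply (the names via_zip and via_apply are only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'s': ['Big bear eats cat', 'cute cat sleeps'],
                   'a': ['bear', 'cat']})

# Plain Python loop over the paired columns; str.replace swaps every
# occurrence of the substring a inside s.
via_zip = [s.replace(a, 'ANIMAL') for s, a in zip(df['s'], df['a'])]

# Row-wise apply: axis=1 passes each row to the lambda as a Series.
via_apply = df.apply(lambda row: row['s'].replace(row['a'], 'ANIMAL'), axis=1)

print(via_zip)
print(via_apply.tolist())
```

Either result can be assigned back with df['s'] = ..., just like the np.vectorize version.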
Upvotes: 1
Reputation: 164673
You can use a list comprehension:
df['s'] = [' '.join([i if i != a else 'ANIMAL' for i in s.split()])
           for a, s in zip(df['a'], df['s'])]
print(df)
      a                    s
0  bear  Big ANIMAL eats cat
1   cat   cute ANIMAL sleeps
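One difference worth keeping in mind: splitting on whitespace only replaces whole words, whereas str.replace also rewrites substrings inside longer words. A small sketch with made-up data (the 'catalog' row is hypothetical and not from the question) shows the two behaviours side by side:

```python
import pandas as pd

# Hypothetical data: 'cat' also appears inside the word 'catalog'.
df = pd.DataFrame({'s': ['cat reads catalog'], 'a': ['cat']})

# Whole-word match: only tokens exactly equal to a are swapped.
# Note that split()/join also collapses runs of whitespace.
whole_word = [' '.join('ANIMAL' if w == a else w for w in s.split())
              for a, s in zip(df['a'], df['s'])]

# Substring match: every occurrence of a is swapped, even inside words.
substring = [s.replace(a, 'ANIMAL') for a, s in zip(df['a'], df['s'])]

print(whole_word)  # ['ANIMAL reads catalog']
print(substring)   # ['ANIMAL reads ANIMALalog']
```

Which behaviour is preferable depends on the data; for the two rows in the question both give the same output.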
Upvotes: 1