Reputation: 2189
I have a data frame like this:
df
col1 col2 col3
ab 1 prab
cd 2 cdff
ef 3 eef
I want to remove the col1 values from the col3 values.
The final data frame should look like this:
df
col1 col2 col3
ab 1 pr
cd 2 ff
ef 3 e
How can I do this most efficiently using pandas?
Upvotes: 0
Views: 1783
Reputation: 150735
It looks like a loop is unavoidable, since you have to replace/remove a different substring in each row. In that case, a list comprehension might come in handy. Timing apply on the sample data:
%%timeit
df.apply(lambda x: x['col3'].replace(x['col1'], ''), axis=1)
# 767 µs ± 24.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
while the list comprehension gives:
%%timeit
[a.replace(b,'') for a,b in zip(df['col3'], df['col1'])]
# 24.4 µs ± 3.18 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
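To actually update the frame, the list can be assigned straight back to the column. A minimal sketch, assuming the sample frame from the question (rebuilt here by hand):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "col1": ["ab", "cd", "ef"],
    "col2": [1, 2, 3],
    "col3": ["prab", "cdff", "eef"],
})

# Pair each col3 value with the col1 value of the same row
# and strip the col1 substring out of it
df["col3"] = [a.replace(b, "") for a, b in zip(df["col3"], df["col1"])]
print(df)
```

This assumes both columns contain only strings; a missing value (NaN) in either column would raise an error here.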
Upvotes: 1
Reputation: 42886
Use .apply with replace over axis=1:
df['col3'] = df.apply(lambda x: x['col3'].replace(x['col1'], ''), axis=1)
Output
col1 col2 col3
0 ab 1 pr
1 cd 2 ff
2 ef 3 e
Upvotes: 2
Reputation: 2265
Suppose df is a matrix (a plain list of lists rather than a pandas DataFrame):
df = [["ab",1,"prab"],["cd",2,"cdff"],["ef",3,"eef"]]
You want to remove the key (col1) from the value (col3) in each row:
for row in df:
    row[2] = row[2].replace(row[0], "")
According to the Python documentation for str.replace, every occurrence of the col1 value is replaced by an empty string: "".
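One thing worth checking: str.replace removes every occurrence of the substring, not just one. If only the first match should go, the optional count argument can limit it. A small sketch with made-up strings (not from the question):

```python
# Hypothetical values chosen so the key appears twice
value = "abxab"
key = "ab"

# Default behaviour: all occurrences are removed
all_removed = value.replace(key, "")

# With count=1, only the first occurrence is removed
first_removed = value.replace(key, "", 1)

print(all_removed)
print(first_removed)
```

For the data in the question the two give the same result, since each key appears only once in its value.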
Upvotes: 0