Kallol

Reputation: 2189

How to remove a column string value from another column string value?

I have a dataframe like this

df:
col1                      col2
blue water bottle        blue
red wine glass           red
green cup                green 

I want to make another column that contains the value of col1 with the value of col2 removed. For example, the new column col3 would be:

water bottle
wine glass
cup

I have tried this code:

df.apply(lambda x: x['col1'].replace(x['col2'], ''), axis=1)

But I am getting the following error:

AttributeError: ("'NoneType' object has no attribute 'replace'", 'occurred at index 0')

How can I do this?

Upvotes: 2

Views: 136

Answers (4)

gosuto

Reputation: 5741

Drop rows with NaN entries before applying your lambda: df[['col1', 'col2']].dropna().apply(lambda x: x['col1'].replace(x['col2'], ''), axis=1)
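
As a rough sketch of how this plays out, assuming a made-up frame with one missing col1 value (the sample data is not from the question): the result keeps the original index, so assigning it back to df should leave NaN in the row that was dropped.

```python
import pandas as pd

# Hypothetical sample data; the None in col1 mimics the failing row.
df = pd.DataFrame({
    "col1": ["blue water bottle", None, "green cup"],
    "col2": ["blue", "red", "green"],
})

# Only rows where both columns are present reach the lambda.
result = df[["col1", "col2"]].dropna().apply(
    lambda x: x["col1"].replace(x["col2"], ""), axis=1
)

# Assignment aligns on the index, so the dropped row ends up as NaN.
df["col3"] = result
```

Note that the remaining values still carry the space that followed the removed word.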

Upvotes: 0

Sede

Reputation: 61273

The reason is that "col1" is None for some rows in your dataframe. You will need to handle those cases, for example by assigning an empty string to col3:

df["col3"] = df.apply(
    lambda x: "" if pd.isnull(x["col1"]) else x["col1"].replace(x["col2"], ""),
    axis=1
)
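
To see both the failure and the guard in one place, here is a minimal sketch. The sample frame is an assumption (the question's real data is not shown), with one col1 value deliberately left missing.

```python
import pandas as pd

# Hypothetical data: the second row has no col1 value.
df = pd.DataFrame({
    "col1": ["blue water bottle", None, "green cup"],
    "col2": ["blue", "red", "green"],
})

# Without a guard, the missing value has no .replace method,
# so apply raises AttributeError.
try:
    df.apply(lambda x: x["col1"].replace(x["col2"], ""), axis=1)
    raised = False
except AttributeError:
    raised = True

# With the guard, rows with a missing col1 get an empty string instead.
df["col3"] = df.apply(
    lambda x: "" if pd.isnull(x["col1"]) else x["col1"].replace(x["col2"], ""),
    axis=1,
)
```

This guard only covers col1. If col2 can also be missing, x["col1"].replace would raise a TypeError for that row, so a similar check on col2 would be needed.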

Upvotes: 2

pathankhan.salman

Reputation: 54

This is one way (a vectorized approach would be faster, of course):

import pandas as pd

df = pd.DataFrame()
df['col'] = ['blue water bottle', 'red wine glass', 'green cup']
df['col2'] = ['blue', 'red', 'green']
df['col3'] = ['', '', '']
for idx, row in df.iterrows():
    # iterrows may yield copies, so write through df.at to be sure the frame is updated
    df.at[idx, 'col3'] = row['col'].replace(row['col2'], '').strip()
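
As far as I know, pandas has no built-in string method that takes a different search string per row, so a list comprehension over the two columns is a common substitute for the loop. A minimal sketch, assuming neither column contains missing values:

```python
import pandas as pd

df = pd.DataFrame({
    "col": ["blue water bottle", "red wine glass", "green cup"],
    "col2": ["blue", "red", "green"],
})

# Pair the two columns row by row and build the new column in one pass.
df["col3"] = [
    text.replace(word, "").strip()
    for text, word in zip(df["col"], df["col2"])
]
```

This avoids building a Series object for every row, which is usually the slow part of iterrows and apply with axis=1.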

pandas string replace

Upvotes: -1

Vivek Kalyanarangan

Reputation: 9081

Use:

df[['col','col2']].apply(lambda x: x['col'].replace(x['col2'], ''), axis=1)

Output

0     water bottle
1       wine glass
2              cup
dtype: object
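
One thing the right-aligned printout hides: each value still starts with the space that followed the removed word. A small sketch of cleaning that up, assuming the same sample data as in the question:

```python
import pandas as pd

df = pd.DataFrame({
    "col": ["blue water bottle", "red wine glass", "green cup"],
    "col2": ["blue", "red", "green"],
})

# Raw result: the removed word leaves its trailing space behind.
raw = df.apply(lambda x: x["col"].replace(x["col2"], ""), axis=1)

# Trim leading and trailing whitespace from every value.
clean = raw.str.strip()
```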

Upvotes: 1
