Reputation: 483
I have a pandas dataframe as below:
+------+---+
|Name |age|
+------+---+
|Mona |12 |
+------+---+
|Monika|25 |
+------+---+
|Tomas |3 |
+------+---+
|Ilyas |47 |
+------+---+
Now I want to add a new column whose value depends on whether the Name
column contains a given substring. For example, if the name contains Mon, we assign text,
and if it contains as, we assign city.
So the output must be:
+------+---+----+
|Name |age|new |
+------+---+----+
|Mona |12 |text|
+------+---+----+
|Monika|25 |text|
+------+---+----+
|Tomas |3 |city|
+------+---+----+
|Ilyas |47 |city|
+------+---+----+
I wrote the following code, but it didn't work:
df['new'] = np.nan
df['new'] = df['new'].mask( 'Mon' in df['Name'], 'text')
df['new'] = df['new'].mask( 'as' in df['Name'], 'city')
Upvotes: 0
Views: 697
Reputation: 21
df.loc[df['Name'].str.endswith('as'), 'new'] = 'city'
df.loc[df['Name'].str.startswith('Mon'), 'new'] = 'text'
The str accessor has startswith and endswith methods for this kind of problem. There is probably a way to do it in a single line, but nothing comes to mind right now.
loc is always helpful for selecting and updating data based on a condition.
Edit: Juan C is right, it should be str.contains rather than str.startswith; my bad.
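For reference, here is a minimal sketch of the same loc approach with str.contains, as the edit suggests. The DataFrame is rebuilt from the question's table. Note that the in operator on a Series tests index labels rather than values, so 'Mon' in df['Name'] does not look at the names at all.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["Mona", "Monika", "Tomas", "Ilyas"],
    "age": [12, 25, 3, 47],
})

# `in` on a Series checks the index labels (0..3 here), not the values.
label_check = "Mon" in df["Name"]

# str.contains returns a boolean Series; .loc assigns only where it is True.
# Rows that match neither substring would be left as NaN.
df.loc[df["Name"].str.contains("as"), "new"] = "city"
df.loc[df["Name"].str.contains("Mon"), "new"] = "text"

print(df)
```

Because the 'Mon' assignment runs last, it takes priority for any name that contains both substrings.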
Upvotes: 0
Reputation: 14216
I think this would be your best bet, since it will handle case-insensitive matches and fill NaN
where neither condition is met.
import numpy as np
Name Age
0 Mona 12
1 Monika 25
2 Tomas 3
3 Ilyas 47
default = np.where(df.Name.str.contains('as', case=False), 'city', np.nan)
cond = np.where(df.Name.str.contains('Mon', case=False), 'text', default)
df.assign(new=cond)
Name Age new
0 Mona 12 text
1 Monika 25 text
2 Tomas 3 city
3 Ilyas 47 city
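If more conditions get added, the nested np.where calls can become hard to read. A possible alternative is np.select, sketched below under a few assumptions: the DataFrame is rebuilt from the output above, the 'Peter' row is a hypothetical addition to show the unmatched case, and default=None is used as the missing-value marker instead of np.nan.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    # "Peter" / 30 is a hypothetical extra row that matches neither substring.
    "Name": ["Mona", "Monika", "Tomas", "Ilyas", "Peter"],
    "Age": [12, 25, 3, 47, 30],
})

# Conditions are checked in order; the first one that is True wins.
conditions = [
    df["Name"].str.contains("Mon", case=False).to_numpy(),
    df["Name"].str.contains("as", case=False).to_numpy(),
]
choices = ["text", "city"]

# default=None leaves unmatched rows as a missing value (object dtype).
df["new"] = np.select(conditions, choices, default=None)

print(df)
```

Putting 'Mon' first in the condition list gives it the same priority it has in the nested np.where version.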
Upvotes: 0
Reputation: 5460
If your logic is going to get complicated, you may want to encapsulate your mapping into a function:
def map_func(name):
    if 'Mon' in name:
        return 'text'
    elif 'as' in name:
        return 'city'

df['Name'].map(map_func)
Result:
0 text
1 text
2 city
3 city
Name: Name, dtype: object
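If the number of substrings grows, one option is to keep the rules in a list and loop over them inside the mapping function. This is only a sketch: RULES and label_for are hypothetical names, and the data is rebuilt from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Mona", "Monika", "Tomas", "Ilyas"],
    "age": [12, 25, 3, 47],
})

# Hypothetical rule table: (substring, label), checked in order.
RULES = [("Mon", "text"), ("as", "city")]

def label_for(name):
    """Return the label of the first rule whose substring occurs in name."""
    for substring, label in RULES:
        if substring in name:
            return label
    return None  # no rule matched

df["new"] = df["Name"].map(label_for)

print(df)
```

Adding a new rule is then a one-line change to RULES rather than another elif branch.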
Upvotes: 1
Reputation: 6132
This should do:
df['new']=np.where(df.Name.str.contains('Mon'), 'text', 'city')
np.where works like Excel's IF: if the condition is met, fill with 'text', else fill with 'city'.
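Note that a single np.where labels every name without 'Mon' as 'city', whether or not it contains 'as'. If that matters, the calls can be nested, much like nested IFs in Excel. A minimal sketch, where the 'Peter' row and the 'other' label are hypothetical additions for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    # "Peter" is a hypothetical row containing neither substring.
    "Name": ["Mona", "Monika", "Tomas", "Ilyas", "Peter"],
})

has_mon = df["Name"].str.contains("Mon")
has_as = df["Name"].str.contains("as")

# Outer IF: 'Mon' -> 'text'; inner IF: 'as' -> 'city', else 'other'.
df["new"] = np.where(has_mon, "text", np.where(has_as, "city", "other"))

print(df)
```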
Upvotes: 0