Reputation: 1943
I have a data frame like this.
     Name  Age
0    Mr A   28
1   Mrs B   32
2   Mrs C   30
3    Mr D   34
4  Miss E   23
5    Mr F   37
I want to keep only the rows where the name has the title 'Mr' and create a new data frame like the one below.
   Name  Age
0  Mr A   28
1  Mr D   34
2  Mr F   37
I've tried the following approach using a loop.
import re
import pandas as pd

rows = []
for i, row in df.iterrows():
    # keep rows whose name contains "Mr" followed by whitespace
    if re.search(r'Mr\s', row['Name']):
        rows.append(row)
new_df = pd.DataFrame(rows)
It works, but is there a more efficient way to do this without a loop?
Upvotes: 2
Views: 166
Reputation: 7204
You can try:
df.loc[df['Name'].str.contains(r'Mr ')]
   Name  Age
0  Mr A   28
3  Mr D   34
5  Mr F   37
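Note that the filtered result keeps the original index labels (0, 3, 5), while the expected output in the question is numbered 0, 1, 2. If that matters, chaining reset_index(drop=True) should renumber the rows. A minimal sketch, assuming the data frame from the question (new_df is just the name the question uses):

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Name": ["Mr A", "Mrs B", "Mrs C", "Mr D", "Miss E", "Mr F"],
    "Age": [28, 32, 30, 34, 23, 37],
})

# Boolean mask: True where the name contains "Mr" followed by a space
mask = df["Name"].str.contains(r"Mr ")

# Keep the matching rows and renumber the index from 0
new_df = df.loc[mask].reset_index(drop=True)
print(new_df)
```

drop=True discards the old index instead of adding it back as a column.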
Upvotes: 1
Reputation: 22493
Use str.contains with the word boundary \b:
import pandas as pd
df = pd.DataFrame({"Name": ["Mr A", "Mrs B", "Mrs C", "Mr D"]})
print(df[df["Name"].str.contains(r"\bMr\b")])
   Name
0  Mr A
3  Mr D
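One caveat: str.contains searches anywhere in the string, so \bMr\b would also match a name where "Mr" appears as a whole word later on. If the title should only count at the start of the name, str.match (which anchors at the beginning of the string) may be a closer fit. A rough sketch; the row "Dr Mr G" is made up purely to show the difference and is not from the question:

```python
import pandas as pd

# "Dr Mr G" is a hypothetical row, added only to illustrate the difference
df = pd.DataFrame({"Name": ["Mr A", "Mrs B", "Mrs C", "Mr D", "Dr Mr G"]})

# "Mr" as a whole word anywhere in the name
anywhere = df[df["Name"].str.contains(r"\bMr\b")]

# "Mr" as a whole word at the start of the name only
at_start = df[df["Name"].str.match(r"Mr\b")]

print(anywhere)
print(at_start)
```

For the sample data in the question both approaches select the same rows, so this only matters if names can contain "Mr" somewhere other than the front.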
Upvotes: 1