Adnan Toky

Reputation: 1943

How to filter rows from a pandas data frame where a specific value matches a regex

I have a data frame like this.

    Name      Age
0   Mr A      28
1   Mrs B     32
2   Mrs C     30
3   Mr D      34
4   Miss E    23
5   Mr F      37

I want to keep only the rows whose name has the title 'Mr' and create a new data frame like the one below.

    Name      Age
0   Mr A      28
1   Mr D      34
2   Mr F      37

I've tried the following method using a loop.

import re
import pandas as pd

rows = []
for i, row in df.iterrows():
    # raw string so \s is read as the regex whitespace class
    if re.search(r'Mr\s', row['Name']):
        rows.append(row)

new_df = pd.DataFrame(rows)

It works, but is there a more efficient way to do this without a loop?

Upvotes: 2

Views: 166

Answers (2)

oppressionslayer

Reputation: 7204

You can try:

df.loc[df['Name'].str.contains(r'Mr ')]

   Name  Age
0  Mr A   28
3  Mr D   34
5  Mr F   37
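Note that the filtered rows keep their original index labels (0, 3, 5), while the question asks for 0, 1, 2. Here is a minimal sketch of one way to get that, assuming the data frame from the question. The `^` anchor and the `reset_index(drop=True)` call are my additions, not part of the answer above: the anchor restricts the match to names that start with "Mr", and the reset renumbers the rows.

```python
import pandas as pd

# Data frame rebuilt from the question.
df = pd.DataFrame({
    "Name": ["Mr A", "Mrs B", "Mrs C", "Mr D", "Miss E", "Mr F"],
    "Age": [28, 32, 30, 34, 23, 37],
})

# Boolean mask: True where Name starts with "Mr" followed by whitespace.
mask = df["Name"].str.contains(r"^Mr\s", regex=True)

# Keep the matching rows, then renumber the index from 0.
new_df = df[mask].reset_index(drop=True)
print(new_df)
```

If you want to keep the original row labels, just drop the `reset_index` call.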

Upvotes: 1

Henry Yik

Reputation: 22493

Use str.contains with word boundary \b:

import pandas as pd

df = pd.DataFrame({"Name":["Mr A","Mrs B","Mrs C","Mr D"]})

print(df[df["Name"].str.contains(r"\bMr\b")])


   Name
0  Mr A
3  Mr D
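One thing to watch for if the column can contain missing values: depending on the pandas version and column dtype, `str.contains` may return NaN for those rows, and a mask containing NaN cannot be used for boolean indexing. A hedged sketch, assuming a hypothetical frame with one missing name (the `None` entry is made up for illustration), is to pass `na=False` so missing values count as non-matches:

```python
import pandas as pd

# Hypothetical data: same names as above plus one missing value.
df = pd.DataFrame({"Name": ["Mr A", None, "Mrs B", "Mr D"]})

# na=False makes missing names evaluate to False instead of NaN,
# so the result is a clean boolean mask.
mask = df["Name"].str.contains(r"\bMr\b", na=False)

result = df[mask]
print(result)
```

The word boundary `\b` still keeps "Mrs B" out, since there is no boundary between "r" and "s".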

Upvotes: 1
