Piyush Ghasiya

Reputation: 525

Better way to extract rows with specific string in Python

I have a dataset of more than 10,000 rows and 6 columns (one of the columns is "Name"). I want to extract all rows whose Name matches one of a set of specific names.

For example, to extract the rows for two names I use this code:

import pandas as pd

df = pd.read_csv('Sample.csv') 

df = df[df.Name.str.contains("name_1|name_3")]

df.to_csv("Name_list.csv")

The problem is that I have hundreds of names to extract data for, and with the code above I would have to type (or copy/paste) every name into the pattern, which is time-consuming.

Is there a better way to achieve my objective?

Thank you in advance!

Upvotes: 1

Views: 1361

Answers (2)

BENY

Reputation: 323306

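You can keep the names in their own CSV file, load it, and join the names into a single pattern:

import pandas as pd

namelist = pd.read_csv('name.csv')  # one name per row, in a column called 'name'
df = pd.read_csv('Sample.csv')
# Join the names into one regex alternation, e.g. "name_1|name_3|..."
df = df[df.Name.str.contains('|'.join(namelist['name'].tolist()))]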
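As a self-contained sketch of the same idea, the example below replaces the two files with small in-memory stand-ins (the sample rows and the `Value` column are made up for illustration). It also escapes each name with `re.escape`, on the assumption that the names should be matched literally rather than interpreted as regex syntax, and passes `na=False` so that missing `Name` values count as non-matches.

```python
import io
import re

import pandas as pd

# Hypothetical stand-ins for 'name.csv' and 'Sample.csv'.
names_csv = io.StringIO("name\nname_1\nname_3\n")
sample_csv = io.StringIO(
    "Name,Value\n"
    "name_1,10\n"
    "name_2,20\n"
    "name_3,30\n"
    "name_1,40\n"
)

namelist = pd.read_csv(names_csv)
df = pd.read_csv(sample_csv)

# Escape each name so characters like '.' or '+' are matched literally,
# then join everything into one alternation pattern.
pattern = "|".join(re.escape(n) for n in namelist["name"].astype(str))

# na=False keeps NaN out of the boolean mask when Name has missing values.
filtered = df[df["Name"].str.contains(pattern, na=False)]
print(filtered)
```

With real files, swapping the two `io.StringIO` objects for the file paths should be the only change needed.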

Upvotes: 1

Tim Biegeleisen

Reputation: 521674

If you want to keep using the regex-based str.contains() approach, you can build the alternation from a Python list of names, e.g.

names = ['name_1', 'name_3']  # add more names here if desired
regex = r'(?:' + '|'.join(names) + r')'
df = df[df.Name.str.contains(regex)]
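One thing to keep in mind is that str.contains() matches substrings, so a pattern for `name_1` also matches `name_10`. If the names are meant to match the whole cell, `Series.isin` may be a simpler alternative. This is a different technique from the regex approach above, sketched here on made-up data:

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame(
    {
        "Name": ["name_1", "name_10", "name_3", "name_2"],
        "Value": [1, 2, 3, 4],
    }
)
names = ["name_1", "name_3"]

# Exact match: keep rows whose Name is one of the listed names.
exact = df[df["Name"].isin(names)]

# Substring match, for comparison: 'name_10' is also kept because it contains 'name_1'.
substring = df[df["Name"].str.contains("|".join(names))]

print(exact)
print(substring)
```

Which of the two is appropriate depends on whether partial matches are wanted.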

Upvotes: 3
