user1775614

Reputation: 451

How would I select rows in pandas that match a list of strings, not just one particular string?

Let's say we have a dataframe df with a column labelled 'A'. For selecting rows that match ONE string, 'some_string', df['A'].str.contains('some_string') works great.

My question is: is there a corresponding way to pass a list of strings to contains, so that rows partially matching any of them are returned? Instead of 'some_string', can I give it a list of strings? I am trying to avoid looping over the list, slicing the dataframe each time, and concatenating the pieces into a new dataframe.

Let's say the dataframe is

pd.DataFrame(np.array([['cat', 2], ['rat', 5], ['ball', 8], ['string', 8]]), columns=['A', 'B'])

and

substrings = ['at', 'll', 'ac']

So I want to select the rows with cat, rat, ball. Sorry for the artificially contrived example.

Upvotes: 3

Views: 7831

Answers (3)

danz

Reputation: 83

The simplest and most pandas-friendly approach (for exact matches) is:

list_of_strings = ['string1', 'string2']

df[df['A'].isin(list_of_strings)] 

From https://sparkbyexamples.com/pandas/pandas-use-a-list-of-values-to-select-rows-from-dataframe/
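Note that isin compares whole values, so it covers the exact-match case rather than the partial matches the question asks about. A minimal sketch of the difference, using the question's data (the dataframe construction and variable names here are illustrative, not from the answer):

```python
import pandas as pd

df = pd.DataFrame({"A": ["cat", "rat", "ball", "string"], "B": [2, 5, 8, 8]})

# isin matches when the whole cell value equals one of the listed strings.
exact = df[df["A"].isin(["cat", "ball"])]

# Substrings such as "at" or "ll" are not equal to any cell value,
# so this selection comes back empty.
partial = df[df["A"].isin(["at", "ll", "ac"])]

print(exact)
print(len(partial))
```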

Upvotes: 2

Graipher

Reputation: 7186

pandas.Series.str.contains takes either a string or a regex. So you could just build a regex from the list of strings:

import pandas as pd

strings = "fo", "ba"
x = pd.Series(["foo", "bar", "baz", "buzz"])
x.str.contains("|".join(strings))
# 0     True
# 1     True
# 2     True
# 3    False
# dtype: bool

This might be slow if your list of strings to match against is very long. You might also need na=False so that NaN values are treated as non-matches, as mentioned in the comments by @anky_91.
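Applied to the example from the question, the same idea might look like the sketch below. The re.escape call is an addition that is not part of the answer above; it is assumed here so that any regex metacharacters in the substrings are matched literally. The extra row with a missing value is only there to show the effect of na=False.

```python
import re

import pandas as pd

df = pd.DataFrame(
    {"A": ["cat", "rat", "ball", "string", None], "B": [2, 5, 8, 8, 1]}
)
substrings = ["at", "ll", "ac"]

# Escape each substring so characters such as "." or "+" are taken literally,
# then join them into a single alternation pattern: "at|ll|ac".
pattern = "|".join(re.escape(s) for s in substrings)

# na=False makes missing values count as "no match" instead of propagating NaN,
# so the result can be used directly as a boolean mask.
mask = df["A"].str.contains(pattern, na=False)
matches = df[mask]
print(matches)
```

If the substrings are known to be plain text without special characters, joining them with "|" directly, as in the answer, behaves the same way.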

Upvotes: 5

Tacratis

Reputation: 1055

If A always contains exactly the string you want to find in the list, you can do this:

df['A'].map(lambda x: 1 if x in list_of_strings else 0) 

The lambda function checks, for each row, whether the value in 'A' (temporarily stored in x) exists as one of the elements in list_of_strings, and returns 1 or 0 accordingly.

You can then filter for the rows where this new mapped column is 1.
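A minimal sketch of that filtering step, using the question's data (the dataframe and the names flags, selected and partial are illustrative). The last part is a variation, not from the answer, that swaps the exact membership test for a substring test with any():

```python
import pandas as pd

df = pd.DataFrame({"A": ["cat", "rat", "ball", "string"], "B": [2, 5, 8, 8]})
list_of_strings = ["cat", "ball"]

# 1 where the whole value of 'A' is in the list, 0 otherwise.
flags = df["A"].map(lambda x: 1 if x in list_of_strings else 0)

# Keep only the rows flagged with 1.
selected = df[flags == 1]
print(selected)

# Variation (an assumption, not part of the answer): test whether any of the
# substrings occurs inside the value, which gives partial matching.
substrings = ["at", "ll", "ac"]
partial = df[df["A"].map(lambda x: any(s in x for s in substrings))]
print(partial)
```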

Upvotes: 0
