hiimarksman

Reputation: 301

How to filter a pandas column by list of strings?

The standard way to filter a pandas column by a substring is something like:

    output = df['Column'].str.contains('string')
    strings = ['string 1', 'string 2', 'string 3']

Instead of the single 'string', though, I want to filter against every string in the list "strings". So I tried something like:

    output = df['Column'].str.contains('*strings')

The closest solution I could find was "How to filter pandas DataFrame with a list of strings", but it did not work.

Edit: I should note that I'm aware of the | (or) operator. However, I'm wondering how to handle the case where the list "strings" changes: the end goal is to loop through several lists of varying lengths.

Upvotes: 5

Views: 13735

Answers (2)

Mutaz-MSFT

Reputation: 806

You should probably look into the isin() function (pandas.Series.isin).

Check the code below:

    import pandas as pd

    df = pd.DataFrame({'Column': ['string 1', 'string 1', 'string 2', 'string 2', 'string 3', 'string 4', 'string 5']})
    strings = ['string 1', 'string 2', 'string 3']
    # Boolean mask: True where the value equals one of the listed strings
    output = df['Column'].isin(strings)

    df[output]

Output:

        Column
    0   string 1
    1   string 1
    2   string 2
    3   string 2
    4   string 3
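
Note that isin() keeps rows whose value equals one of the listed strings exactly; it does not match substrings the way str.contains() does. A small sketch of the difference, using sample data made up for illustration:

```python
import pandas as pd

# Illustrative data (not from the question): one exact value,
# one value that merely contains a listed string, one unrelated value.
df = pd.DataFrame({"Column": ["string 1", "prefix string 1", "string 4"]})
strings = ["string 1", "string 2"]

# isin: exact, whole-value match against the list.
exact = df[df["Column"].isin(strings)]

# str.contains with a "|"-joined pattern: substring match.
partial = df[df["Column"].str.contains("|".join(strings), regex=True)]
```

Here `exact` keeps only the first row, while `partial` also keeps "prefix string 1". Which one is appropriate depends on whether the column holds the strings themselves or longer text that contains them.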

Upvotes: 6

Victor Hugo Borges

Reputation: 413

You can build a regex pattern from the list and search the column with it, like this:

    df['Column'].str.contains('|'.join(strings), regex=True)
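
A minimal sketch of how this might be wrapped for lists that change between iterations, assuming substring matching is what is wanted. The helper name `filter_by_substrings` and the sample data are made up for illustration. `re.escape` is an addition here, so that strings containing regex metacharacters such as `+` or `.` are matched literally, and `na=False` treats missing values as non-matches.

```python
import re

import pandas as pd


def filter_by_substrings(df, column, strings):
    """Return rows of df whose `column` contains any of `strings`.

    Hypothetical helper for illustration; not part of pandas.
    """
    if not strings:
        # An empty pattern would match every row, so return no rows instead.
        return df.iloc[0:0]
    # Escape each string so regex metacharacters are matched literally,
    # then join with "|" to build an alternation pattern.
    pattern = "|".join(re.escape(s) for s in strings)
    mask = df[column].str.contains(pattern, regex=True, na=False)
    return df[mask]


# Illustrative data, including a missing value and a string with "+".
df = pd.DataFrame(
    {"Column": ["string 1", "a string 2 b", "other", "c++ code", None]}
)

# Lists of varying length can be looped over without changing the filter.
results = [
    filter_by_substrings(df, "Column", strings)
    for strings in (["string 1"], ["string 2", "c++"], [])
]
```

Because the pattern is rebuilt from whatever list is passed in, the same call works for one string or many. The empty-list case is handled separately as a design choice; returning all rows would be an equally defensible convention.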

Upvotes: 8
