How to select rows based off a column entry using regex to filter?

Question

Here is a schematic of the dataframe I'm working with (note, this is a representative example, and is not meant to demonstrate all possible entries in any column):

Name | Screen | Placeholder for other columns
Bill | GHRF (OOC) | text
Bob | GHRF (IC) | text
Sue | IRMS/CIR (OOC) | text
John | GHRF ISOFORMS IRMS CIR (OOC) | text

I am trying to select all the rows that have (OOC) in the Screen column.

Normally, I would filter a dataframe with something like dfnew = df[df['Column'] == 'Criteria'], but that only matches exact values and doesn't work with a regex.

I have also tried dfnew = df[df['Screen'].filter(regex = r'OOC', axis = 0)], which I thought would work, but it didn't.

Could someone please explain to me how I can select rows based on a column entry using regex?

What I would like to wind up with is something like this:

Name | Screen | Placeholder
Bill | GHRF (OOC) | text
Sue | IRMS/CIR (OOC) | text
John | GHRF ISOFORMS IRMS CIR (OOC) | text

cs95 · Accepted Answer

DataFrame.filter filters on the column names, not values. You're looking for str.contains.

dfnew = df[df['Screen'].str.contains(r'\(OOC\)')]

Or, if you don't need regex, switch it off:

dfnew = df[df['Screen'].str.contains(r'(OOC)', regex=False)]

print(dfnew)
   Name                        Screen
0  Bill                    GHRF (OOC)
2   Sue                IRMS/CIR (OOC)
3  John  GHRF ISOFORMS IRMS CIR (OOC)
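For anyone who wants to reproduce this, here is a minimal self-contained sketch. The sample frame is rebuilt from the question's schematic, and the column name "Other" and the variable names are placeholders of mine, not part of the original data. It also shows DataFrame.filter selecting by label rather than by value, and uses re.escape as one way to escape the parentheses without typing the backslashes by hand.

```python
import re

import pandas as pd

# Sample data rebuilt from the question's schematic ("Other" is a placeholder name).
df = pd.DataFrame({
    "Name": ["Bill", "Bob", "Sue", "John"],
    "Screen": [
        "GHRF (OOC)",
        "GHRF (IC)",
        "IRMS/CIR (OOC)",
        "GHRF ISOFORMS IRMS CIR (OOC)",
    ],
    "Other": ["text"] * 4,
})

# DataFrame.filter matches labels, not cell values:
# this keeps the columns whose *name* matches the regex "Scr".
cols_only = df.filter(regex=r"Scr", axis=1)

# Regex match on the values: the parentheses are escaped so they are literal.
mask_regex = df["Screen"].str.contains(r"\(OOC\)")

# re.escape produces the same escaped pattern from the plain string.
mask_escaped = df["Screen"].str.contains(re.escape("(OOC)"))

# Plain substring match with the regex engine switched off.
mask_literal = df["Screen"].str.contains("(OOC)", regex=False)

# Boolean-mask indexing keeps only the rows where the mask is True.
dfnew = df[mask_regex]
print(dfnew)
```

All three masks select the same rows here, so which one to use is mostly a matter of whether the pattern really needs regex features.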

If you're planning to do more indexing/assignment on dfnew, I'd recommend creating it as an explicit copy instead, to avoid a SettingWithCopyWarning later on:

dfnew = df[df['Screen'].str.contains(r'\(OOC\)')].copy()
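A small sketch of what that looks like in practice, assuming a cut-down version of the question's data. The "Flagged" column is hypothetical and only there to show an assignment on the filtered frame.

```python
import pandas as pd

# Cut-down sample data based on the question's schematic.
df = pd.DataFrame({
    "Name": ["Bill", "Bob", "Sue"],
    "Screen": ["GHRF (OOC)", "GHRF (IC)", "IRMS/CIR (OOC)"],
})

# .copy() makes dfnew an independent frame rather than a selection from df.
dfnew = df[df["Screen"].str.contains("(OOC)", regex=False)].copy()

# Assigning a new (hypothetical) column now affects dfnew only.
dfnew["Flagged"] = True

print(dfnew)
print(df.columns.tolist())
```

The original df is left untouched; only dfnew gains the extra column.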
