Reputation: 27
I am trying to write a Python script that can take a dataframe of arbitrary size and filter it for rows where any column matches some criteria, specifically a text match. I show a 2x4 table as an example, but my data could have several thousand rows and anywhere from 1 to 15 columns.
| A header | Another header |
|---|---|
| Event1 | Event2 |
| Event2 | OPER-8 |
| OPER-1 | Event2 |
| Event3 | Event4 |
I basically want to filter the dataframe to get the rows where any entry matches the pattern "OPER*", or perhaps where "Event2" is present. Most of the approaches I've found online require you to specify a column name or number, but I want to search the whole dataframe. I know I could use a for loop, but is there a more efficient way to do this with pandas?
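For reference, the example above could be reproduced with something like this (column names taken straight from the table; purely illustrative):

import pandas as pd

df = pd.DataFrame({
    "A header": ["Event1", "Event2", "OPER-1", "Event3"],
    "Another header": ["Event2", "OPER-8", "Event2", "Event4"],
})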
Upvotes: 0
Views: 34
Reputation: 18416
You can apply a function along the row axis of the dataframe to produce a True/False value per row for the given condition, then select rows using that boolean mask:
>>> df[df.apply(lambda x: x.str.match(r'^OPER-\d+|Event2$').any(), axis=1)]
A header Another header
0 Event1 Event2
1 Event2 OPER-8
2 OPER-1 Event2
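If some of your columns might not hold strings, or the row-wise apply turns out to be slow on thousands of rows, here is a column-wise sketch of the same idea (it reuses the df from the question; pattern and mask are just illustrative names). The default axis=0 apply runs the vectorized .str.match once per column, and any(axis=1) collapses the resulting boolean frame into a per-row mask:

>>> pattern = r'^OPER-\d+$|^Event2$'   # "OPER-<number>" or an exact "Event2" cell
>>> mask = df.astype(str).apply(lambda col: col.str.match(pattern, na=False)).any(axis=1)
>>> df[mask]

astype(str) guards against non-string columns, and this selects the same three rows as above.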
Upvotes: 2