Pythonic
Pythonic

Reputation: 2131

Pandas - Select rows of a dataframe that contains a certain regex in ANY column

Good morning

given a dataframe that contains text data such as:

df = pandas.DataFrame({
    'a':['first', 'second', 'third'], 
    'b':['null', 'third', 'first']})

I can select rows that contain the word 'first' by:

df.a.str.contains('first') | df.b.str.contains('first')

which would yield

0     True
1    False
2     True
dtype: bool

To apply the same condition to to dozens of columns I could use isin, but it seems not to work if I need to substitute 'first' with a regex, as in regex = '(?=.*first)(?=.*second)'.

Is there a more pythonic and elegant way to select on multiple columns, rather than just concatenating several single-column df.<column_name>.str.contains(regex) conditions with | in the code? Thanks

Upvotes: 1

Views: 2600

Answers (1)

fixxxer
fixxxer

Reputation: 16144

Why don't we use applymap on the entire data frame. This will be different than working the columns but would make it easier for your to apply if-else conditions to (I hope):

In [62]: l = ['first', 'second']

In [63]: df
Out[63]: 
        a      b
0   first   null
1  second  third
2   third  first

In [64]: df.appl
df.apply     df.applymap  

In [64]: df.applymap(lambda v: True if v in l else False)
Out[64]: 
       a      b
0   True  False
1   True  False
2  False   True

Update:

(Thanks to @Pythonic for this update)

We can supply regex in applymap like so:

regex = '(^fi)'
df.applymap(lambda v: bool(re.search(regex, v)))
## -- End pasted text --
Out[38]: 
       a      b
0   True  False
1  False  False
2  False   True

Following example is with re.flags enabled:

In [44]: df = pandas.DataFrame({
   ....:     'a':['First', 'second', 'NULL'], 
   ....:     'b':['null', 'third', 'first']})

In [45]: regex = re.compile('(^fi)', flags=re.IGNORECASE)

In [46]: df.applymap(lambda v: bool(re.search(regex_ignore_case, v)))
Out[46]: 
       a      b
0   True  False
1  False  False
2  False   True

Upvotes: 2

Related Questions