nobodyAskedYouPatrice

Reputation: 131

Using a regex pattern to filter rows from a pandas DataFrame

Suppose I have a pandas dataframe like this:

         Word      Ratings
   0     TLYSFFPK  1
   1     SVLENFVGR 2
   2     SVFNHAIRK 3
   3     KAGEVFIHK 4

How can I use a regex in pandas to keep only the rows whose Word matches the following pattern, while preserving the DataFrame structure? The regex pattern is: \b.[VIFY][MLFYIA]\w+[LIYVF].[KR]\b

Expected output:

         Word    Ratings
   1     SVLENFVGR 2
   2     SVFNHAIRK 3

Upvotes: 3

Views: 11549

Answers (1)

MaxU - stand with Ukraine

Reputation: 210832

Demo:

In [2]: df
Out[2]:
        Word  Ratings
0   TLYSFFPK        1
1  SVLENFVGR        2
2  SVFNHAIRK        3
3  KAGEVFIHK        4

In [3]: pat = r'\b.[VIFY][MLFYIA]\w+[LIYVF].[KR]\b'

In [4]: df.Word.str.contains(pat)
Out[4]:
0    False
1     True
2     True
3    False
Name: Word, dtype: bool

In [5]: df[df.Word.str.contains(pat)]
Out[5]:
        Word  Ratings
1  SVLENFVGR        2
2  SVFNHAIRK        3
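
For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch of the same approach. It rebuilds the four rows exactly as they appear in the question; the names `mask`, `matching` and `non_matching` are my own, and passing `na=False` is an extra assumption (it treats missing values in `Word` as non-matches so the mask stays purely boolean).

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Word": ["TLYSFFPK", "SVLENFVGR", "SVFNHAIRK", "KAGEVFIHK"],
    "Ratings": [1, 2, 3, 4],
})

# Raw string so that \b and \w reach the regex engine unchanged.
pat = r'\b.[VIFY][MLFYIA]\w+[LIYVF].[KR]\b'

# Boolean mask: True where the pattern is found anywhere in Word.
# na=False (an assumption, not in the original demo) maps NaN to False.
mask = df["Word"].str.contains(pat, regex=True, na=False)

matching = df[mask]        # rows whose Word matches the pattern
non_matching = df[~mask]   # the complement, if you want to drop matches instead

print(matching)
```

Boolean indexing keeps the original index and columns, so `matching` is still a regular DataFrame containing rows 1 and 2. Note that `str.contains` searches anywhere in the string; here the `\b` anchors effectively pin the match to the whole word because each cell holds a single word.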

Upvotes: 12
