Python: Pandas Dataframe Using Wildcard to Find String in Column and Keep Row

Question

I have a pandas data frame. Below is a sample table.

Event   Text
A       something/AWAIT hello          
B       la de la
C       AWAITING SHIP
D       yes NO AWAIT

I want to only keep rows that contain some form of the word AWAIT in the Text column. Below is my desired table:

Event   Text
A       something/AWAIT hello          
C       AWAITING SHIP
D       yes NO AWAIT

Below is the code I tried to capture strings that contain AWAIT in all possible circumstances.

df_STH001_2 = df_STH001[df_STH001['Text'].str.contains("?AWAIT?") == True]

The error I get is as follows:

error: nothing to repeat at position 0

MaxU - stand with Ukraine · Accepted Answer

Series.str.contains(pat, case=True, flags=0, na=nan, regex=True) treats pat as a regular expression by default.

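In a regular expression, the question mark (?) makes the preceding token optional. In "?AWAIT?" the first ? sits at position 0 with no token before it, hence the error "nothing to repeat at position 0". No wildcards are needed here, because str.contains already looks for the pattern anywhere in the string, so the plain substring is enough.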
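The error can be reproduced without pandas, since the message comes from Python's re module. A minimal sketch follows; the helper name try_compile is only for illustration and is not part of pandas or re.

```python
import re

def try_compile(pattern):
    """Return None if the pattern compiles, otherwise the regex error message."""
    try:
        re.compile(pattern)
        return None
    except re.error as exc:
        return str(exc)

# The leading ? has no token before it to make optional, so compilation fails.
print(try_compile("?AWAIT?"))

# A plain substring is already a valid pattern.
print(try_compile("AWAIT"))

# re.search looks for the pattern anywhere in the string, not only at the start.
print(bool(re.search("AWAIT", "AWAITING SHIP")))
```

If a literal ? ever needs to be matched, it can be escaped as \? or the whole pattern can be passed through re.escape.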

In [178]: d[d['Text'].str.contains('AWAIT')]
Out[178]:
  Event                   Text
0     A  something/AWAIT hello
2     C          AWAITING SHIP
3     D           yes NO AWAIT
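For a version that can be run from scratch, here is a sketch that rebuilds the sample table from the question. The names df, mask and kept are assumptions for illustration. Passing regex=False treats the pattern as a literal substring, and na=False counts missing values as non-matches so the mask stays purely boolean.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Event": ["A", "B", "C", "D"],
    "Text": [
        "something/AWAIT hello",
        "la de la",
        "AWAITING SHIP",
        "yes NO AWAIT",
    ],
})

# regex=False: match "AWAIT" literally, with no regex parsing.
# na=False: rows with missing Text are treated as not matching.
mask = df["Text"].str.contains("AWAIT", regex=False, na=False)

# Boolean indexing keeps only the rows where the mask is True.
kept = df[mask]
print(kept)
```

If lowercase variants such as "await" should also be kept, case=False can be added to the same call.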
