PineNuts0

Reputation: 5234

Python: Pandas Dataframe Using Wildcard to Find String in Column and Keep Row

I have a pandas data frame. Below is a sample table.

Event   Text
A       something/AWAIT hello          
B       la de la
C       AWAITING SHIP
D       yes NO AWAIT 

I want to only keep rows that contain some form of the word AWAIT in the Text column. Below is my desired table:

Event   Text
A       something/AWAIT hello          
C       AWAITING SHIP
D       yes NO AWAIT 

Below is the code I tried to capture strings that contain AWAIT in all possible circumstances.

df_STH001_2 = df_STH001[df_STH001['Text'].str.contains("?AWAIT?") == True]

The error I get is as follows:

error: nothing to repeat at position 0

Upvotes: 8

Views: 20504

Answers (2)

Amir F

Reputation: 2529

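You can also try the match method. Note that str.match anchors the pattern at the start of each string, so to find a word in the middle of the text the pattern has to allow for leading characters (for example '.*some_string'):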

df[df.column.str.match('some_string')]
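To see how the anchoring plays out, here is a minimal sketch on the question's sample data. The frame is rebuilt by hand from the table in the question, and the variable names are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question's table (names are illustrative).
df = pd.DataFrame({
    "Event": ["A", "B", "C", "D"],
    "Text": ["something/AWAIT hello", "la de la", "AWAITING SHIP", "yes NO AWAIT"],
})

# str.match only succeeds when the pattern matches at the start of the string,
# so a bare "AWAIT" keeps only the row whose text begins with it.
anchored = df[df["Text"].str.match("AWAIT")]

# Allowing any leading characters finds AWAIT anywhere in the text.
anywhere = df[df["Text"].str.match(".*AWAIT")]
```

For a plain "contains" check, str.contains is usually the more direct choice, since it searches the whole string without needing the leading ".*".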

Upvotes: 1

MaxU - stand with Ukraine

Reputation: 210882

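Series.str.contains(pat, case=True, flags=0, na=nan, regex=True) treats pat as a regular expression by default.

The question mark (?) makes the preceding token in a regular expression optional. In "?AWAIT?" the first ? has no preceding token, hence the error "nothing to repeat at position 0". No wildcards are needed, because str.contains already looks for the pattern anywhere in the string.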
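The error can be reproduced without pandas. The sketch below uses only Python's standard re module, on the assumption that str.contains hands the pattern to it unchanged when regex=True.

```python
import re

# The pattern from the question: the leading "?" is a quantifier
# with no token before it, so compilation fails.
try:
    re.compile("?AWAIT?")
    compiled = True
    message = ""
except re.error as exc:
    compiled = False
    message = str(exc)

# A plain substring pattern needs no wildcards: re.search scans the whole string.
found = re.search("AWAIT", "something/AWAIT hello") is not None
```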

In [178]: d[d['Text'].str.contains('AWAIT')]
Out[178]:
  Event                   Text
0     A  something/AWAIT hello
2     C          AWAITING SHIP
3     D           yes NO AWAIT
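A self-contained variant of the same idea is sketched below. The frame is rebuilt from the question's table, and row E with a missing value is a hypothetical addition to show how na=False behaves. It makes the `== True` comparison from the question unnecessary.

```python
import pandas as pd

# Sample data from the question, plus a hypothetical row E with a missing value.
df = pd.DataFrame({
    "Event": ["A", "B", "C", "D", "E"],
    "Text": ["something/AWAIT hello", "la de la", "AWAITING SHIP", "yes NO AWAIT", None],
})

# regex=False treats the pattern as a literal substring, so characters such as
# "?" or "*" carry no special meaning.
# na=False counts missing values as non-matches, giving a purely boolean mask.
mask = df["Text"].str.contains("AWAIT", regex=False, na=False)
kept = df[mask]

# case=False also matches lowercase or mixed-case spellings.
mask_any_case = df["Text"].str.contains("await", case=False, regex=False, na=False)
```

Using regex=False is a design choice, not a requirement: the plain 'AWAIT' pattern works as a regular expression too, but the literal mode avoids surprises if the search term ever contains regex metacharacters.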

Upvotes: 8
