Reputation: 5234
I have a pandas data frame. Below is a sample table.
Event  Text
A      something/AWAIT hello
B      la de la
C      AWAITING SHIP
D      yes NO AWAIT
I want to only keep rows that contain some form of the word AWAIT in the Text column. Below is my desired table:
Event  Text
A      something/AWAIT hello
C      AWAITING SHIP
D      yes NO AWAIT
Below is the code I tried to capture strings that contain AWAIT in all possible circumstances.
df_STH001_2 = df_STH001[df_STH001['Text'].str.contains("?AWAIT?") == True]
The error I get is as follows:
error: nothing to repeat at position 0
Upvotes: 8
Views: 20504
Reputation: 2529
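You can also try the match method. Note that it anchors the pattern at the start of each string, so a leading .* is needed to find text that appears later in the value: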
df[df.column.str.match('some_string')]
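Applied to the question's sample data, a rough sketch of the difference might look like the following. The frame is rebuilt by hand from the table in the question, and the variable names are only illustrative. Because str.match only matches from the beginning of each value, the bare pattern finds a single row, while the pattern with a leading .* finds all three.

```python
import pandas as pd

# Sample frame reconstructed from the question's table (assumed layout).
df = pd.DataFrame({
    "Event": ["A", "B", "C", "D"],
    "Text": ["something/AWAIT hello", "la de la", "AWAITING SHIP", "yes NO AWAIT"],
})

# str.match anchors at the start of the string: only "AWAITING SHIP" qualifies.
starts_with = df[df["Text"].str.match("AWAIT")]

# A leading ".*" lets the match begin anywhere before AWAIT.
anywhere = df[df["Text"].str.match(".*AWAIT")]

print(starts_with)
print(anywhere)
```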
Upvotes: 1
Reputation: 210882
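Series.str.contains(pat, case=True, flags=0, na=nan, regex=True) treats pat as a regular expression by default.
The question mark (?) makes the preceding token in the regular expression optional. In "?AWAIT?" the first ? has no preceding token, hence the error "nothing to repeat at position 0". No wildcards are needed, because a plain 'AWAIT' already matches anywhere in the string: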
In [178]: d[d['Text'].str.contains('AWAIT')]
Out[178]:
   Event                   Text
0      A  something/AWAIT hello
2      C          AWAITING SHIP
3      D           yes NO AWAIT
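If the search term should be taken literally, or the column may contain lowercase text or missing values, something along these lines could work. This is only a sketch: rows E and F are hypothetical additions to the question's sample, included to show the case=False and na=False behaviour, and regex=False turns off regular-expression parsing so characters such as ? have no special meaning.

```python
import numpy as np
import pandas as pd

# Rows A-D come from the question; E and F are hypothetical extras
# (a lowercase hit and a missing value) added for illustration.
df = pd.DataFrame({
    "Event": ["A", "B", "C", "D", "E", "F"],
    "Text": [
        "something/AWAIT hello",
        "la de la",
        "AWAITING SHIP",
        "yes NO AWAIT",
        "awaiting pickup",
        np.nan,
    ],
})

# regex=False: treat the pattern as a plain substring.
# case=False:  ignore upper/lower case.
# na=False:    treat missing values as "no match" so the mask is purely boolean.
mask = df["Text"].str.contains("AWAIT", case=False, na=False, regex=False)
kept = df[mask]

print(kept)
```

With na=False the result is a clean boolean mask, so the `== True` comparison from the question is no longer needed.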
Upvotes: 8