sachinruk

Reputation: 9869

Match words in the middle

Suppose that I want to match the first two texts but not the third.

import pandas as pd

test_text = [
"Command to remove :: blah Reason for removal:",
"Command to be removed :: Command (<NAME>) Reason for removal:",
"Command to <RANDOM-WORD> removed :: Command (<NAME>) Reason for removal:"
]

df = pd.DataFrame({"text": test_text})
df["text"].str.contains(my_regex) # REQUIRED OUTPUT: True, True, False

The only thing I can think of is my_regex = r"Command to (be)? remove". However, this does not match all of the sentences I expect it to. What's the correct way of doing this?

Upvotes: 0

Views: 63

Answers (2)

Hasnat

Reputation: 591

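You have to make the whitespace before or after "be" optional. Your pattern has a literal space on both sides of the optional group, so when "be" is absent it only matches two consecutive spaces. If your sentences vary between "removed" and "remove", make the "d" in "removed" optional as well.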

r"Command to\s?(be)? removed?"
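
A quick way to check this, sketched with the standard re module rather than pandas (str.contains with its default regex=True should give the same True/False values, since it searches anywhere in the string). As far as I can tell, the pattern from the question fails on the text without "be", because that is the case that needs two spaces. The helper name contains is just for illustration.

```python
import re

test_text = [
    "Command to remove :: blah Reason for removal:",
    "Command to be removed :: Command (<NAME>) Reason for removal:",
    "Command to <RANDOM-WORD> removed :: Command (<NAME>) Reason for removal:",
]

# Pattern from the question: a literal space sits on both sides of "(be)?",
# so when "be" is absent it needs two consecutive spaces.
original = r"Command to (be)? remove"

# Pattern from this answer: the space before "be" is optional.
fixed = r"Command to\s?(be)? removed?"


def contains(pattern, texts):
    """Return one bool per text: True if the pattern occurs anywhere in it."""
    return [re.search(pattern, t) is not None for t in texts]


original_results = contains(original, test_text)
fixed_results = contains(fixed, test_text)

print(original_results)  # [False, True, False]
print(fixed_results)     # [True, True, False]
```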

Upvotes: 1

Fran Na Jaya

Reputation: 388

You can use

my_regex = r"Command to (([b][e] )|)remove"
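
If it helps readability, the same idea can be written more compactly. This is only a sketch: compact_pattern is my own rewrite, not something from the question, and it assumes "be" is always followed by exactly one space. The non-capturing group (?:...) should also avoid the UserWarning about match groups that pandas raises in str.contains when the pattern has capture groups. Checked here with the re module:

```python
import re

test_text = [
    "Command to remove :: blah Reason for removal:",
    "Command to be removed :: Command (<NAME>) Reason for removal:",
    "Command to <RANDOM-WORD> removed :: Command (<NAME>) Reason for removal:",
]

# Pattern from this answer: either "be " or nothing, then "remove".
answer_pattern = r"Command to (([b][e] )|)remove"

# Hypothetical compact rewrite: "be " is optional as a unit,
# and the group does not capture.
compact_pattern = r"Command to (?:be )?remove"


def contains(pattern, texts):
    """Return one bool per text: True if the pattern occurs anywhere in it."""
    return [re.search(pattern, t) is not None for t in texts]


answer_results = contains(answer_pattern, test_text)
compact_results = contains(compact_pattern, test_text)

print(answer_results)   # [True, True, False]
print(compact_results)  # [True, True, False]
```

Either pattern can be assigned to my_regex and passed to df["text"].str.contains(my_regex).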

Upvotes: 1
