RightmireM

Reputation: 2492

Regex capture phrase, plus word before and word after

With Python's re module, I'm trying to capture a phrase, plus the word before it and the word after it, in a single match.

For example, from the sentence
We want to see this phrase here and then again!
I want to return
see this phrase here

The closest I have gotten is ...

>>> import re
>>> s = 'We want to see this phrase here and then again!'
>>> re.search(r"\w*\sthis phrase\w*\s",s)
<_sre.SRE_Match object; span=(11, 27), match='see this phrase '>

Upvotes: 1

Views: 623

Answers (2)

anubhava

Reputation: 784998

In your regex, the \w*\s that follows the search term matches zero word characters and then a single whitespace character, so the word after the phrase is never captured.
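To see this concretely, here is a small sketch that reuses the pattern from the question and only adds capture groups around the two \w* parts. The groups are an addition for illustration, not part of the original attempt.

```python
import re

s = 'We want to see this phrase here and then again!'

# Same pattern as in the question; the parentheses are added only so we can
# inspect what each \w* actually consumed.
m = re.search(r"(\w*)\sthis phrase(\w*)\s", s)

print(repr(m.group(1)))  # 'see' : the word before the phrase
print(repr(m.group(2)))  # ''    : \w* matched nothing after the phrase
print(repr(m.group(0)))  # 'see this phrase ' : the match ends at the space
```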

You may use this more robust regex, which also handles a search term at the very start or very end of the input (or of each line, if you compile it with re.MULTILINE):

(?:^|\w+\s+)this phrase(?:\s+\w+|$)


RegEx Details:

  • (?:^|\w+\s+): Match start position or a word followed by 1+ whitespaces
  • this phrase: Match search term
  • (?:\s+\w+|$): Match end position or 1+ whitespace followed by a word
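A minimal Python sketch of how this pattern could be used. The helper name phrase_with_neighbours is hypothetical, and the pattern is compiled without flags, so ^ and $ anchor to the whole string here.

```python
import re

# Pattern from this answer, as a raw string so the backslashes reach the
# regex engine unchanged.
PATTERN = re.compile(r"(?:^|\w+\s+)this phrase(?:\s+\w+|$)")

def phrase_with_neighbours(text):
    """Return the phrase plus its neighbouring words, or None if absent.

    Hypothetical helper name, used only for this illustration.
    """
    m = PATTERN.search(text)
    return m.group(0) if m else None

print(phrase_with_neighbours('We want to see this phrase here and then again!'))
# see this phrase here
print(phrase_with_neighbours('this phrase starts the sentence'))
# this phrase starts
print(phrase_with_neighbours('we end with this phrase'))
# with this phrase
```

When the phrase sits at the start or end of the string, the missing neighbour is simply left out of the match.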

Upvotes: 3

tripleee

Reputation: 189327

This looks like a simple typo. Your attempt looks for this phrase immediately followed by more word characters (so phrases or phraseology too) followed by a space, but you say you want them in the opposite order.

"\w*\sthis phrase\s\w*"

This still won't work correctly for "This phrase has no space before it." or "However, this phrase, unfortunately, is bracketed by punctuation." so it might still need more work on the design if you want to process free-form text.
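A short sketch of both points. STRICT is the corrected pattern from this answer. LOOSE is a hypothetical variant that is not from the thread: it assumes that any run of non-word characters may separate the phrase from its neighbours, that the neighbours are optional, and that case should be ignored. Treat it as one possible starting point rather than a complete solution for free-form text.

```python
import re

# Corrected pattern from this answer, as a raw string.
STRICT = re.compile(r"\w*\sthis phrase\s\w*")

# Hypothetical variant (an assumption, not from the thread): optional
# neighbours, separated from the phrase by any non-word characters.
LOOSE = re.compile(r"(?:\w+\W+)?\bthis phrase\b(?:\W+\w+)?", re.IGNORECASE)

def first_match(pattern, text):
    """Return the matched text, or None when the pattern does not match."""
    m = pattern.search(text)
    return m.group(0) if m else None

samples = [
    'We want to see this phrase here and then again!',
    'This phrase has no space before it.',
    'However, this phrase, unfortunately, is bracketed by punctuation.',
]

for text in samples:
    print(first_match(STRICT, text), '|', first_match(LOOSE, text))
# see this phrase here | see this phrase here
# None | This phrase has
# None | However, this phrase, unfortunately
```

STRICT fails on the second sample because nothing precedes the phrase (and the capital T would not match without re.IGNORECASE anyway), and on the third because a comma, not whitespace, follows the phrase.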

Upvotes: 2
