John Could

Reputation: 165

Regex: include 3 words in front of and 3 words behind the selected text

I'm using this regex formula in Excel to find the desired text in a paragraph:

=RegexExtract(B2,"(bot|vehicle|scrape)")

This formula successfully returns any of the 3 words when it is found in a paragraph. As an extra, I would like the regex to return the matched word in bold, along with 3 words in front of it and 3 words behind it.

Example of text:

A car (or automobile) is a wheeled motor vehicle used for transportation. 
Most definitions of car say they run primarily on roads, seat one to eight people,
have four tires, and mainly transport people rather than goods.

Example output:

a wheeled motor **vehicle** used for transportation

I want a portion of the surrounding text to appear so that the receiver can more easily pinpoint where the match is located.

Any alternative approach is much appreciated.

Upvotes: 2

Views: 200

Answers (1)

Wiktor Stribiżew

Reputation: 626851

You may use

=RegexExtract(B2,"(?:\w+\W+(?:\w+\W+){0,2})?(?:bot|vehicle|scrape)(?:\W+\w+(?:\W+\w+){0,2})?")

See the regex demo and the Regulex graph.

Details: All groups in the pattern are non-capturing, so REGEXEXTRACT returns the whole match, which must meet the following pattern:

  • (?:\w+\W+(?:\w+\W+){0,2})? - an optional sequence of 1+ word chars followed by 1+ non-word chars, then zero, one or two repetitions of 1+ word chars followed by 1+ non-word chars (that is, up to 3 words before the keyword)
  • (?:bot|vehicle|scrape) - one of the words bot, vehicle or scrape
  • (?:\W+\w+(?:\W+\w+){0,2})? - an optional sequence of 1+ non-word chars followed by 1+ word chars, then zero, one or two repetitions of 1+ non-word chars followed by 1+ word chars (that is, up to 3 words after the keyword).
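
If you want to check the pattern outside the spreadsheet, here is a minimal Python sketch of the same idea. It is not part of the formula above: the function name `extract_with_context` is hypothetical, and I have assumed one change to the pattern, namely that the keyword sits in a capturing group so it can be wrapped in `**` markers to imitate the bold text from the question.

```python
import re

# Same pattern as in the formula, with one assumption added for this sketch:
# the keyword is captured so it can be wrapped in bold markers afterwards.
PATTERN = re.compile(
    r"(?:\w+\W+(?:\w+\W+){0,2})?"   # up to 3 words before the keyword
    r"(bot|vehicle|scrape)"         # the keyword itself (captured)
    r"(?:\W+\w+(?:\W+\w+){0,2})?"   # up to 3 words after the keyword
)


def extract_with_context(text):
    """Return the first keyword with up to 3 words on each side.

    The keyword is wrapped in ** markers; None is returned when
    no keyword is found. (Hypothetical helper, for illustration only.)
    """
    match = PATTERN.search(text)
    if match is None:
        return None
    kw_start, kw_end = match.span(1)
    before = text[match.start():kw_start]
    after = text[kw_end:match.end()]
    return before + "**" + match.group(1) + "**" + after


sample = (
    "A car (or automobile) is a wheeled motor vehicle used for transportation. "
    "Most definitions of car say they run primarily on roads."
)
print(extract_with_context(sample))
# a wheeled motor **vehicle** used for transportation
```

Note that the keyword alternation has no word boundaries, so `bot` would also match inside a longer word such as `robot`. Wrapping the keyword in `\b...\b` should prevent that if it is not what you want.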

Tested in Google Spreadsheets.

Upvotes: 2
