Macher

Reputation: 3

How can I find a string based on a specific word, between two specific words?

New to Regex, please help!

Example String:

START

  blahblah
  blahblah blahblah
  blahblahblahblah

  blahblah KEYWORD blah

  blahblah
  blah

END

Problem: I would like to locate the entire string (between START and END) containing a certain KEYWORD.

Context: I have a large file with multiple iterations of the multi-line START*END example string and need to sort these strings based on the KEYWORD they contain. Each string contains the same START and END, but a different KEYWORD.

What I have so far:

START\s[\s\S]*?(?=END\s|\Z)    returns the entire string, but is not specific to a KEYWORD

Not sure how to go about finding the entire string based on the KEYWORD.

Any help would be appreciated.

Thanks!

Upvotes: 0

Views: 239

Answers (1)

Amadan

Reputation: 198314

(?s)(?<=START)(?:(?!END).)*?(?:KEYWORD1|KEYWORD2)(?:.*?)(?=END)

First, (?s) makes the dot match any character, including a newline. The match starts just after START and ends just before END. In between, we want as few characters as possible, none of which begins the string END, followed by KEYWORD1 or KEYWORD2, followed by as few characters as possible up to the next END.
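To see the pattern above in action, here is a minimal sketch in Python. The sample text is made up for illustration, and Python's re module is an assumption (the question does not name a regex engine); the pattern itself is unchanged.

```python
import re

# Hypothetical sample: two START...END blocks, only the second
# contains one of the listed keywords.
text = """START
  alpha
  beta OTHER gamma
END
START
  delta
  epsilon KEYWORD1 zeta
END
"""

# Same pattern as in the answer. (?s) lets "." match newlines,
# (?:(?!END).)*? walks forward without ever stepping over an END.
pattern = re.compile(
    r"(?s)(?<=START)(?:(?!END).)*?(?:KEYWORD1|KEYWORD2)(?:.*?)(?=END)"
)

# No capturing groups, so findall returns the full matches.
matches = pattern.findall(text)

for body in matches:
    print(repr(body))
```

Because of the lookbehind and lookahead, START and END themselves are not part of the match, only the text between them.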

This is based on the assumption that you have a finite list of keywords. If, on the other hand, keywords are identified by some other means, then you should address Michael Butscher's comment first.
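With a finite keyword list, sorting the blocks by keyword could be sketched roughly as follows. This is a variation, not the exact pattern above: it drops the lookarounds so that START and END are included in each match, and it captures the keyword in a group. The names KEYWORDS, block_re and group_blocks are hypothetical, and Python is assumed.

```python
import re
from collections import defaultdict

# Hypothetical finite keyword list.
KEYWORDS = ["KEYWORD1", "KEYWORD2"]

# Build KEYWORD1|KEYWORD2, escaping each keyword in case it
# contains regex metacharacters.
alternation = "|".join(re.escape(k) for k in KEYWORDS)

# START, then as few characters as possible without crossing END,
# then a captured keyword, then as little as possible up to END.
block_re = re.compile(rf"(?s)START(?:(?!END).)*?({alternation}).*?END")

def group_blocks(text):
    """Map each keyword to the list of full blocks that contain it."""
    groups = defaultdict(list)
    for m in block_re.finditer(text):
        groups[m.group(1)].append(m.group(0))
    return dict(groups)

sample = (
    "START\n a KEYWORD2 b\nEND\n"
    "START\n c\nEND\n"
    "START\n d KEYWORD1 e\nEND\n"
    "START\n f KEYWORD2 g\nEND\n"
)
result = group_blocks(sample)
```

Blocks without any listed keyword are skipped, and a block holding several keywords is filed under the first one that appears in it.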

Upvotes: 2
