britter

Reputation: 151

Use regex to find a block of text that does not contain a string

I have a large file in which I need to find the blocks where a tag is absent. For example:

begin
test
stuff here
1234
end

begin
other stuff
key
end

I would like to find each begin-end section that does not contain the key field. So in this example I would match on the first begin-end section but not the second.

I was able to match each section using begin(.|\n)+?end, but I couldn't figure out how to match only the sections without the key in them. I was reading about backreferences, but I couldn't figure out how to use those in this situation either.

Upvotes: 1

Views: 1440

Answers (1)

The fourth bird

Reputation: 163427

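If negative lookahead is supported, you can check that none of the lines following begin start with end or contain key. The anchors ^ and $ need to match at line boundaries, so enable multiline mode if your regex engine does not use it by default.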

^begin\b.*(?:\r?\n(?!end|.*\bkey\b).*)*\r?\nend$
  • ^ Start of line
  • begin\b.* Match begin, then any char except newline 0+ times
  • (?: Non capturing group
    • \r?\n(?!end|.*\bkey\b).* Match a newline, then a line that neither starts with end nor contains the word key
  • )* Close non capturing group and repeat 0+ times
  • \r?\nend Match a newline followed by end
  • $ End of line
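
As a minimal sketch of how the pattern above could be applied, here it is in Python against the sample from the question. The choice of Python and the variable names are assumptions for illustration; any engine with negative lookahead and a multiline flag should behave similarly.

```python
import re

# Pattern from the answer. re.MULTILINE makes ^ and $ match at every line
# boundary instead of only at the start and end of the whole string.
PATTERN = re.compile(
    r"^begin\b.*(?:\r?\n(?!end|.*\bkey\b).*)*\r?\nend$",
    re.MULTILINE,
)

# Sample input taken from the question.
text = """begin
test
stuff here
1234
end

begin
other stuff
key
end
"""

# Each match is one complete begin-end block with no line containing "key".
blocks_without_key = [m.group(0) for m in PATTERN.finditer(text)]

for block in blocks_without_key:
    print(block)
    print("---")
```

Only the first block is collected. In the second block the line key trips the lookahead, so the repetition stops before end is reached and the overall match fails.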


If end and key are always the only words on their lines, you could use:

^begin.*(?:\r?\n(?!(?:end|key)$).*)*\r?\nend$
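
A short sketch of how this whole-line variant behaves, again assuming Python. The input below is hypothetical and not from the question: it contains one block where key appears inside a longer line and one where key is the entire line.

```python
import re

# Whole-line variant from the answer: a line disqualifies the block only
# if it is exactly "key" (and the block closes only on a line that is exactly "end").
WHOLE_LINE = re.compile(
    r"^begin.*(?:\r?\n(?!(?:end|key)$).*)*\r?\nend$",
    re.MULTILINE,
)

# Hypothetical input for illustration.
sample = """begin
the key is here
end

begin
key
end
"""

whole_line_matches = [m.group(0) for m in WHOLE_LINE.finditer(sample)]

for block in whole_line_matches:
    print(block)
    print("---")
```

Here the first block still matches, because key is not alone on its line, while the second block is skipped. The earlier pattern with \bkey\b would reject both blocks, so the choice between the two depends on how the key field appears in the file.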

Upvotes: 3
