mjharper

Reputation: 145

Regex match first group with certain text

I'm trying to match blocks of text that contain certain text within them. Each block is clearly defined by standard start/end text patterns.

In the example below, I want to match steps 1 and 3, from "step start" to "step end", because they contain the text "database:dev". My current regex matches step 1 correctly, but then matches steps 2 and 3 as a single match. It's probably easier to see in this demo: https://regex101.com/r/56tfOQ/3/

I need to specify that each match can only contain one "step start", but I can't work out how to do that.

The regex I'm currently using is:

(?msi)step start.*?database:dev.*?step end

An example of the text is:

step start
    name:step1
    database:dev1
step end
step start
    name:step2
    database:test1
step end
step start
    name:step3
    database:dev2
step end
step start
    name:step4
    database:test2
step end

Upvotes: 2

Views: 150

Answers (1)

Wiktor Stribiżew

Reputation: 626774

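The usual fix is a tempered greedy token such as (?:(?!<STOP_PATTERN>).)*? placed between the starting delimiter and the string that must appear between the delimiters. It consumes one character at a time, but only when <STOP_PATTERN> does not begin at that position, so the match cannot run past the next starting delimiter.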

You might write your regex as

(?si)step start(?:(?!step start).)*?database:dev.*?step end
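
A quick way to check the difference is to run both patterns over the sample text from the question. This is only a sketch in Python's re module (the question does not say which regex engine is in use), and the variable names are made up for illustration:

```python
import re

# Sample text copied from the question.
text = """step start
    name:step1
    database:dev1
step end
step start
    name:step2
    database:test1
step end
step start
    name:step3
    database:dev2
step end
step start
    name:step4
    database:test2
step end
"""

# Pattern from the question: the lazy .*? can still run past "step end"
# and the next "step start" while looking for "database:dev".
original = re.compile(r"(?msi)step start.*?database:dev.*?step end")

# Tempered version: the token refuses to step onto another "step start".
tempered = re.compile(
    r"(?si)step start(?:(?!step start).)*?database:dev.*?step end"
)

# No capturing groups, so findall returns the full matched strings.
original_matches = original.findall(text)
tempered_matches = tempered.findall(text)

for block in tempered_matches:
    print(block)
    print("-----")
```

With the original pattern, the second match swallows steps 2 and 3 together. With the tempered pattern, the attempt that starts at step 2 fails and the engine moves on to step 3.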

However, it seems your opening delimiter is at the start of a line. Then it makes sense to use

(?msi)^step start(?:(?!^step start).)*?database:dev.*?step end

Details

  • (?msi) - multiline, dotall and case insensitive modes are on
  • ^ - line start (since m option is on)
  • step start - starting delimiter
  • (?:(?!^step start).)*? - a tempered greedy token: matches any char, 0+ times, as few as possible, as long as that char does not begin a step start sequence at the start of a line
  • database:dev - a literal substring
  • .*? - any 0+ chars, as few as possible
  • step end - ending delimiter.
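
To see why anchoring the delimiter with ^ can matter, here is a small sketch, again assuming Python's re module. The block below is hypothetical (it is not in the question's data): it has the words "step start" in the middle of a line, inside a made-up note field.

```python
import re

# Hypothetical block: "step start" also appears mid-line in the note.
text = """step start
    name:step5
    note:re-run after step start fails
    database:dev3
step end
"""

# Anchored pattern: only a "step start" at the beginning of a line
# counts as a delimiter, both for the opening and inside the token.
anchored = re.compile(
    r"(?msi)^step start(?:(?!^step start).)*?database:dev.*?step end"
)

# Unanchored pattern: any "step start" stops the tempered token,
# including the one inside the note.
loose = re.compile(
    r"(?si)step start(?:(?!step start).)*?database:dev.*?step end"
)

anchored_match = anchored.search(text)
loose_match = loose.search(text)

print(anchored_match.group(0))  # the whole block, from the first line
print("-----")
print(loose_match.group(0))     # starts at the mid-line "step start"
```

On this input the unanchored pattern starts its match inside the note line and loses the name line, while the anchored one returns the full block. If "step start" can never appear mid-line in your data, both patterns should behave the same.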

Upvotes: 2
