Saphire

Reputation: 1930

How to capture groups of strings with indentation?

I would like to capture each if-construct in this string as a separate match:

if a > b
    do this
    do that
    if a == c
        do this
        do that

I would like to have a match for

if a > b
    do this
    do that

and

    if a == c
        do this
        do that

What I have so far doesn't separate the individual if-constructs:

if(\W+\w+)+\n\t

Upvotes: 1

Views: 76

Answers (2)

anubhava

Reputation: 785098

You can use this lookahead-based regex:

^(\s*if[\s\S]+?)(?=^\s*if|\z)

in MULTILINE mode.


[\s\S]+? lazily matches one or more characters, including newlines, and (?=^\s*if|\z) is a lookahead asserting that the match is immediately followed by either the start of another if line or the end of input.
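As a quick check, the pattern can be run against the sample from the question. The sketch below uses Python, which is an assumption since the question names no language. \Z is used in place of \z because \Z is the end-of-string anchor in Python's re module, and the names TEXT, PATTERN and blocks are illustrative only.

```python
import re

# Sample input from the question (assumed to be indented with spaces).
TEXT = (
    "if a > b\n"
    "    do this\n"
    "    do that\n"
    "    if a == c\n"
    "        do this\n"
    "        do that\n"
)

# Same pattern as above, with \Z as the end-of-input anchor.
PATTERN = re.compile(r"^(\s*if[\s\S]+?)(?=^\s*if|\Z)", re.MULTILINE)

# One entry per if-construct, in order of appearance.
blocks = [m.group(1) for m in PATTERN.finditer(TEXT)]

for block in blocks:
    print(repr(block))
```

Each match keeps its trailing line break, and a block ends at the next line whose first non-blank characters are if, whatever that line's indentation.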

Upvotes: 3

Lucas Trzesniewski

Reputation: 51330

Whatever you're trying to do, you should consider writing a parser. It'll keep things simpler for you in the long run.
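To give an idea of what such a parser could look like, here is a minimal sketch in Python (an assumption, since the question names no language). It assumes indentation uses spaces only and that an if line is either the bare word if or starts with "if ". The function name split_if_blocks and the stack layout are hypothetical choices for illustration, not a reference implementation.

```python
def split_if_blocks(text):
    """Group each if line with the lines that belong directly to it."""
    blocks = []  # every block found, in order of appearance
    stack = []   # (indent of the if line, block lines) for blocks still open
    for line in text.splitlines():
        stripped = line.lstrip(" ")
        if not stripped:
            continue  # ignore blank lines
        indent = len(line) - len(stripped)
        # Close every open block whose if line is indented at least this much.
        while stack and stack[-1][0] >= indent:
            stack.pop()
        if stripped == "if" or stripped.startswith("if "):
            block = [line]
            blocks.append(block)
            stack.append((indent, block))
        elif stack:
            # An ordinary line belongs to the innermost open block.
            stack[-1][1].append(line)
    return ["\n".join(block) for block in blocks]


TEXT = (
    "if a > b\n"
    "    do this\n"
    "    do that\n"
    "    if a == c\n"
    "        do this\n"
    "        do that\n"
)

for block in split_if_blocks(TEXT):
    print(block)
    print("---")
```

Because open blocks are tracked on a stack, a line that returns to the outer indentation after a nested if is attributed to the outer block again, which a single regex match cannot express.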

If you insist on using a regex, well...

^([ ]*)if.+\r?\n(\1[ ]+).+(?:\r?\n\2(?!if).+)*


Usage:

var re = new Regex(@"^([ ]*)if.+\r?\n(\1[ ]+).+(?:\r?\n\2(?!if).+)*", RegexOptions.Multiline);

Let's split it up into pieces:

  • ^([ ]*)if.+\r?\n matches an if line up to and including its line break, capturing the leading spaces in group 1.
  • (\1[ ]+).+ matches the next line. It must start with more spaces than the if line (so it's indented); that indentation is captured in group 2.
  • (?:\r?\n\2(?!if).+)* matches the following lines up until the next if. Each must start with the same number of spaces as the first line after the if.
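The pieces above can be tried on the sample from the question. This sketch ports the pattern unchanged to Python's re module (an assumption; the usage line above is .NET), and the names TEXT, PATTERN, matches and indents are illustrative only.

```python
import re

# Sample input from the question (assumed to be indented with spaces).
TEXT = (
    "if a > b\n"
    "    do this\n"
    "    do that\n"
    "    if a == c\n"
    "        do this\n"
    "        do that\n"
)

PATTERN = re.compile(
    r"^([ ]*)if.+\r?\n(\1[ ]+).+(?:\r?\n\2(?!if).+)*",
    re.MULTILINE,
)

found = list(PATTERN.finditer(TEXT))

# Full text of each matched if-construct.
matches = [m.group(0) for m in found]

# Width of the if line's indentation (group 1) and of its body (group 2).
indents = [(len(m.group(1)), len(m.group(2))) for m in found]

for text, (if_indent, body_indent) in zip(matches, indents):
    print(if_indent, body_indent, repr(text))
```

On this input the two constructs come back as separate matches without a trailing line break. Note that a match stops at the first nested if, so any lines of the outer block that follow the nested construct are not part of the outer match.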

Upvotes: 1
