aarelovich

Reputation: 5566

Writing a QRegularExpression to only return the greedy last match

I have an input file that looks like this:

.SUBCKT (stuff including newline characters)
.ENDS (stuff)

.SUBCKT (stuff including newline characters)
.ENDS (stuff)

.SUBCKT (stuff including newline characters)
.ENDS (stuff)

.SUBCKT (stuff including newline characters)
.ENDS (stuff I don't want)

(What I want to get, includes newline characters)

.END

So what I want to do is write a regular expression (using Qt, so QRegularExpression) that gets me all the stuff "I want to get" (between the last .ENDS and the .END) without getting the "stuff I don't want". The stuff I don't want lies between the last .ENDS and the first newline character after it.

My most successful attempt at getting this has been this code:

QStringList toplevel;
QRegularExpression regexp_toplevel("\\.ENDS(.*?)\n(.*?)\\.END",QRegularExpression::DotMatchesEverythingOption);
QRegularExpressionMatchIterator toplevel_i = regexp_toplevel.globalMatch(contents);
while (toplevel_i.hasNext()){
    QRegularExpressionMatch match = toplevel_i.next();
    toplevel << match.captured(2);
}

The code above returns a list of strings, and the last one is what I want. But since the iteration is done Java-style, I'm not 100% sure that what I want will ALWAYS be the last one. Is there any way I can write the expression so that it gets ONLY what is between the first newline after the last .ENDS and the .END?

Upvotes: 1

Views: 481

Answers (1)

Wiktor Stribiżew

Reputation: 627034

You may use

(?ms).*^\.ENDS(?-s:.*)\n(.*)\.END

See the regex demo

Details:

  • (?ms) - m will make ^ match a line start and s will make the . match line breaks
  • .* - will match any 0+ chars, as many as possible (greedily), up to the last occurrence of...
  • ^\.ENDS - an .ENDS substring at the start of a line
  • (?-s:.*) - the rest of the line (if any chars are still there)
  • \n - a newline (may be replaced with \R, written \\R in a C++ string literal, to match any style of line break)
  • (.*) - Group 1: any 0+ chars, as many as possible, up to the last .END (if you need to match up to the first .END instead, add ? after *)
  • \.END - .END literal substring.

Declare as

QRegularExpression regexp_toplevel("(?ms).*^\\.ENDS(?-s:.*)\n(.*)\\.END");

The value you need will be inside match.captured(1).
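To try the greedy-prefix idea without a Qt build, here is a minimal sketch using std::regex instead of QRegularExpression. It is an approximation, not the pattern above: std::regex has no (?ms) inline flags, so [\s\S] stands in for a dot that matches line breaks, and \n\.ENDS stands in for ^\.ENDS (which means an .ENDS on the very first line of the input would not be found). The helper name extractTopLevel is made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of the answer): extracts the text between
// the line holding the last ".ENDS" and the last ".END".
// Returns true and fills `out` on success, false if nothing matched.
bool extractTopLevel(const std::string& contents, std::string& out)
{
    // std::regex (ECMAScript grammar) has no (?ms) inline flags, so:
    //   [\s\S]*   greedy prefix that also crosses line breaks
    //   \n\.ENDS  .ENDS directly after a newline (approximates ^\.ENDS)
    //   [^\n]*    the rest of the .ENDS line, i.e. the unwanted stuff
    //   ([\s\S]*) group 1: everything up to the last .END
    static const std::regex re(R"([\s\S]*\n\.ENDS[^\n]*\n([\s\S]*)\.END)");

    std::smatch m;
    if (!std::regex_search(contents, m, re))
        return false;

    out = m[1].str();
    return true;
}
```

Because the leading greedy part swallows everything up to the last .ENDS line, a single search is enough and no iteration over matches is needed. The captured text keeps the blank lines around it, so you may want to trim it afterwards.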

Upvotes: 1
