Reputation: 290
I need to isolate a portion of a string after the key %CONFIG\n. Trailing newlines and other regions that begin with % shall be stripped.
My example string:
Configuration File
Format
<Identifier>: <init> <start> <end> <step>
%CONFIG
Line A: 0 1000 5000 300
Line B: 0 0 200 20
%OPTIONAL_OTHER_KEY
some other definitions
where the only match should be:
Line A: 0 1000 5000 300
Line B: 0 0 200 20
Take everything after and including %OPTIONAL_OTHER_KEY as optional content of the input string, which shall not be included in the match.
So far I have (?<=%CONFIG\n)[\w\W]*(?=%), but it does not strip the trailing newlines.
Upvotes: 1
Views: 57
Reputation: 626816
When you need to leave some whitespace out of the match, use a lazy quantifier on the generic subpattern that comes right before it (if other means do not work) and a greedy quantifier on the whitespace subpattern. (In some regex flavors, such as Tcl, you should not mix lazy and greedy quantifiers; I hope that is not the case here.) This can be implemented quickly, but it might need adjusting if performance issues occur.
So, you can use
(?<=%CONFIG\n)[\w\W]*?(?=\s*%)
                     ^   ^^^
See regex demo
Here, [\w\W]*? uses the lazy quantifier *?, which matches zero or more of any character, but as few as possible. \s*, which matches zero or more whitespace characters, as many as possible, is added to the lookahead so that the whitespace is not part of the match.
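As a quick check, here is a minimal Python sketch comparing the original greedy pattern with the lazy one. The question does not name a regex flavor, so Python's re module is an assumption, and the sample string has an extra blank line added before %OPTIONAL_OTHER_KEY to make the trailing-newline problem visible.

```python
import re

# Sample input from the question; the blank line before
# %OPTIONAL_OTHER_KEY is added here for illustration only.
text = (
    "Configuration File\n"
    "Format\n"
    "<Identifier>: <init> <start> <end> <step>\n"
    "%CONFIG\n"
    "Line A: 0 1000 5000 300\n"
    "Line B: 0 0 200 20\n"
    "\n"
    "%OPTIONAL_OTHER_KEY\n"
    "some other definitions\n"
)

# Original attempt: greedy body, so the trailing newlines stay in the match.
greedy = re.search(r"(?<=%CONFIG\n)[\w\W]*(?=%)", text)

# Lazy body plus greedy \s* inside the lookahead: the whitespace before
# the next % is checked but not consumed.
lazy = re.search(r"(?<=%CONFIG\n)[\w\W]*?(?=\s*%)", text)

print(repr(greedy.group()))
print(repr(lazy.group()))
```

The first line printed ends in newline characters, while the second stops right after the last configuration value.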
However, if there may be no % after the %CONFIG block, the lookahead can never succeed and the pattern finds no match. In that case, you need to use an unrolled version of the lazy quantifier:
(?<=%CONFIG\n)\S*(?:\s+[^\s%]\S*)*
See the demo
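The following Python sketch (again assuming Python's re module, with shortened sample strings made up for illustration) shows the unrolled pattern on input with and without a following % key, next to the lookahead-based pattern for comparison.

```python
import re

# Unrolled pattern: a run of non-whitespace, then repeated groups of
# "whitespace + a token that does not start with %".
UNROLLED = re.compile(r"(?<=%CONFIG\n)\S*(?:\s+[^\s%]\S*)*")
# Lookahead-based pattern, which requires a % somewhere after the block.
LAZY = re.compile(r"(?<=%CONFIG\n)[\w\W]*?(?=\s*%)")

# Hypothetical inputs: one with a following key, one without.
with_key = (
    "%CONFIG\n"
    "Line A: 0 1000 5000 300\n"
    "Line B: 0 0 200 20\n"
    "\n"
    "%OPTIONAL_OTHER_KEY\n"
    "some other definitions\n"
)
without_key = (
    "%CONFIG\n"
    "Line A: 0 1000 5000 300\n"
    "Line B: 0 0 200 20\n"
    "\n"
)

unrolled_with = UNROLLED.search(with_key)
unrolled_without = UNROLLED.search(without_key)
lazy_without = LAZY.search(without_key)

print(repr(unrolled_with.group()))
print(repr(unrolled_without.group()))
print(lazy_without)  # no % after the block, so no match is found
```

Note that the unrolled pattern only treats % as a section marker when it starts a whitespace-separated token, so a value such as 50% inside a configuration line would still be matched.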
Upvotes: 1