Reputation: 558
I am writing a regular expression in C# (Visual Studio 2013).
I have the following scenario:
Match match = Regex.Match("%%Text%%More text%%More more text", "(?<!^)%%[^%]+%%");
But my problem is that I don't want to use capture groups. The reason is that, with capture groups, match.Value still contains %%More text%%, and my idea is to get the string More text directly in match.Value.
The string to get will always be between the second and the third %% delimiter. Put another way, the string will always be between the fourth and fifth % character.
I tried:
Regex.Match("%%Text%%More text%%More more text", "(?:(?<!^)%%[^%]+%%)");
But with no luck.
I want to use match.Value because all my regexes are stored in a database table.
Is there a way to "transform" that regex into one that does not use capturing groups and yields the desired string in match.Value?
Upvotes: 1
Views: 932
Reputation: 627469
If you are sure there are no % characters inside the double-%% delimited strings, you can just use lookarounds like this:
(?<=^%%[^%]*%%)[^%]+(?=%%)
^^^^^^^^^^^^^^^     ^^^^^^
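To illustrate how this pattern behaves with match.Value, here is a minimal sketch using the input from the question. The class name LookaroundDemo and the helper SecondField are made up for this example; only Regex.Match and Match.Value come from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaroundDemo
{
    // Hypothetical helper: returns the text between the second and the third
    // "%%" delimiter, or an empty string when the pattern does not match
    // (Match.Value is "" for a failed match).
    public static string SecondField(string input)
    {
        // The lookbehind and the lookahead only assert context,
        // so the delimiters never become part of match.Value.
        Match match = Regex.Match(input, @"(?<=^%%[^%]*%%)[^%]+(?=%%)");
        return match.Value;
    }

    public static void Main()
    {
        // Prints: More text
        Console.WriteLine(SecondField("%%Text%%More text%%More more text"));
    }
}
```

Because the whole pattern is a plain string, it can be stored in the database table as-is and read back through match.Value without touching any groups.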
If you have single-% delimited strings (like %text1%text2%text3%text4%text5%text6):
(?<=^%[^%]*%)[^%]+(?=%)
And in case the string is between the 4th and the 5th %% delimiter:
(?<=^%%(?:[^%]*%%){3})[^%]+(?=%%)
^^^^^^^^^^^^^^^^^^^^^^     ^^^^^^
For single-% delimited strings:
(?<=^%(?:[^%]*%){3})[^%]+(?=%)
Both regexes use a variable-width lookbehind (which the .NET regex engine supports) and the same kind of lookahead to restrict the context in which the run of one or more characters other than % may appear.
The (?<=^%%[^%]*%%) lookbehind makes sure there is %%[something_other_than_%]%% right after the beginning of the string, and (?<=^%%(?:[^%]*%%){3}) matches %%[substring_not_having_%]%%[substring_not_having_%]%%[substring_not_having_%]%% after the string start.
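The counted variants above differ only in the delimiter and in the {n} quantifier, so the idea can be sketched as one small helper. This is an illustration, not something from the question: the class NthFieldDemo and the method FieldAfter are hypothetical names, and the helper assumes the delimiter is either % or %% and that fields contain no % at all.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NthFieldDemo
{
    // Hypothetical helper: returns the field that follows the n-th delimiter
    // (n >= 1), or "" when there is no such field.
    // Assumes delimiter is "%" or "%%" and that fields never contain '%'.
    public static string FieldAfter(string input, string delimiter, int n)
    {
        string d = Regex.Escape(delimiter);
        // The lookbehind consumes the first delimiter plus (n - 1) further
        // "field + delimiter" groups, all anchored at the string start.
        string pattern = "(?<=^" + d + "(?:[^%]*" + d + "){" + (n - 1) + "})"
                       + "[^%]+"
                       + "(?=" + d + ")";
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        // Between the 4th and the 5th "%%". Prints: d
        Console.WriteLine(FieldAfter("%%a%%b%%c%%d%%e", "%%", 4));
        // Between the 4th and the 5th "%". Prints: text4
        Console.WriteLine(FieldAfter("%text1%text2%text3%text4%text5%text6", "%", 4));
        // The question's input, counted in single "%". Prints: More text
        Console.WriteLine(FieldAfter("%%Text%%More text%%More more text", "%", 4));
    }
}
```

With n = 4 the generated patterns are the two {3} regexes shown above. With n = 2 the lookbehind contains a single {1} group, which is equivalent to the first pair of patterns.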
In case there can be single % symbols inside the double-%% delimited strings, you can use an unroll-the-loop regex:
(?<=^%%(?:[^%]*(?:%(?!%)[^%]*)*%%){3})[^%]*(?:%(?!%)[^%]*)*(?=%%)
This matches the same text as the shorter (?<=^%%(?:.*?%%){3}).*?(?=%%). For short strings, the .*?-based solution should work faster. For very long input texts, use the unrolled version.
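As a rough check of the difference, the sketch below runs the plain [^%]+ pattern, the unrolled pattern, and the lazy .*? pattern on an input whose fourth field contains a single %. The sample input, the class name UnrolledDemo, and the method names are made up for illustration. It only compares results and makes no attempt to measure speed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnrolledDemo
{
    // Plain version: fields must not contain any '%'.
    public static string Simple(string input)
    {
        return Regex.Match(input, @"(?<=^%%(?:[^%]*%%){3})[^%]+(?=%%)").Value;
    }

    // Unrolled version: a field may contain single '%' characters,
    // but never two in a row.
    public static string Unrolled(string input)
    {
        return Regex.Match(input,
            @"(?<=^%%(?:[^%]*(?:%(?!%)[^%]*)*%%){3})[^%]*(?:%(?!%)[^%]*)*(?=%%)").Value;
    }

    // Lazy-dot version of the same idea.
    public static string Lazy(string input)
    {
        return Regex.Match(input, @"(?<=^%%(?:.*?%%){3}).*?(?=%%)").Value;
    }

    public static void Main()
    {
        string input = "%%a%%b%%c%%100% sure%%e";
        // No match: "100" is followed by a single '%', not by "%%".
        Console.WriteLine("[" + Simple(input) + "]");   // []
        Console.WriteLine("[" + Unrolled(input) + "]"); // [100% sure]
        Console.WriteLine("[" + Lazy(input) + "]");     // [100% sure]
    }
}
```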
Upvotes: 2