Reputation: 29863
I have the following string:
123322
In theory, the regex 1.*2
should match the following:
12
(because *
can be zero characters) 12332
123322
If I use the regex 1.*2
it matches 123322
.
Using 1.*?2
, it will match 12
.
Is there a way to match 12332
too?
The perfect thing would be to get all possible matchess in the string (no matter if one match is substring of another)
Upvotes: 1
Views: 257
Reputation: 734
1(.*?2)*$
you will have multiple captures which you can concatenate to form all possible matches
see here:regex tester
click on 'table' and expand the captures tree
Upvotes: 1
Reputation: 138097
Generally, this isn't possible. A regex matching engine isn't really designed to find overlapping matches. A quick solution is simply to check the pattern on all substrings manually:
string text = "1123322";
for (int start = 0; start < text.Length - 1; start++)
{
for (int length = 0; length <= text.Length - start; length++)
{
string subString = text.Substring(start, length);
if (Regex.IsMatch(subString, "^1.*2$"))
Console.WriteLine("{0}-{1}: {2}", start, start + length, subString);
}
}
Working example: http://ideone.com/aNKnJ
Now, is it possible to get a whole-regex solution? Mostly, the answer is no. However, .Net does has a few tricks in its sleeve to help us: it allows variable length lookbehind, and allows each capturing group to remember all captures (most engines only return the last match of each group). Abusing these, we can simulate the same for
loop inside the regex engine:
string text = "1123322!";
string allMatchesPattern = @"
(?<=^ # Starting at the local end position, look all the way to the back
(
(?=(?<Here>1.*2\G))? # on each position from the start until here (\G),
. # *try* to match our pattern and capture it,
)* # but advance even if you fail to match it.
)
";
MatchCollection matches = Regex.Matches(text, allMatchesPattern,
RegexOptions.ExplicitCapture | RegexOptions.IgnorePatternWhitespace);
foreach (Match endPosition in matches)
{
foreach (Capture startPosition in endPosition.Groups["Here"].Captures)
{
Console.WriteLine("{0}-{1}: {2}", startPosition.Index,
endPosition.Index - 1, startPosition.Value);
}
}
Note that currently there's a small bug there - the engine doesn't try to match the last ending position ($), so you loose a few matches. For now, adding a !
at the end of the string solves that issue.
working example: http://ideone.com/eB8Hb
Upvotes: 1
Reputation: 153174
You would need a separate expression for each case, depending on the number of twos you want to match:
1(.*?2){1} #same as 1.*?2
1(.*?2){2}
1(.*?2){3}
...
Upvotes: 1
Reputation: 7961
No, unless there is something else added to the regex to clarify what it should do it will either be greedy or non-greedy. There is no in-betweeny ;)
Upvotes: 2