Reputation: 414
Regexes in .NET (I’m using 4.5.2) appear to have three (non-static) Match methods:
regex.Match(string input) searches for the first match in input.
regex.Match(string input, int startIndex) searches for the first match in input starting at startIndex.
regex.Match(string input, int startIndex, int length) searches for the first match in a range of input defined by startIndex and length.
If I write
System.Text.RegularExpressions.Regex regex =
new System.Text.RegularExpressions.Regex("^abc");
string str = "abc abc";
System.Text.RegularExpressions.Match match = regex.Match(str);
System.Diagnostics.Debug.WriteLine(match.Success);
then I see that match.Success is True, as expected. The regex matches the abc at the beginning of str.
If I then write
int index = 4;
match = regex.Match(str, index);
System.Diagnostics.Debug.WriteLine(match.Success);
to search from index 4 to the end of str, then I see that match.Success is False, as expected. There’s an abc at index 4 of str, but index 4 is not the beginning of the string.
However, if I write
match = regex.Match(str, index, str.Length - index);
System.Diagnostics.Debug.WriteLine(match.Success);
System.Diagnostics.Debug.WriteLine(match.Index);
to again search from index 4 to the end of str, then I see that match.Success is unexpectedly True, and match.Index is 4. I would expect to get the same result as calling regex.Match(str, index).
Is there a way to get consistent start-of-string anchor behavior in .NET Regex Match methods?
Upvotes: 3
Views: 144
Reputation: 626961
From the comments in the Regex.cs source code, I see that public Match Match(String input, int startat) "finds the first match, starting at the specified position", while public Match Match(String input, int beginning, int length) "finds the first match, restricting the search to the specified interval of the char array".
Combined with your test results (and mine), it appears that the last overload of the Regex.Match method treats the substring as a new, separate string and passes that to the regex engine, so ^ matches at the start of the range rather than at the start of the original string. Changing ^ to \A will not help.
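To make the comparison concrete, here is a minimal sketch that runs the three-argument overload next to a plain Match on an explicit Substring of the same range. The class name RangeVersusSubstring is only illustrative; the pattern and input are the ones from the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class RangeVersusSubstring
{
    static void Main()
    {
        Regex regex = new Regex("^abc");
        string str = "abc abc";
        int index = 4;
        int length = str.Length - index;

        // Three-argument overload: the search is restricted to [index, index + length).
        Match ranged = regex.Match(str, index, length);

        // The same range cut out by hand and matched as a separate string.
        Match sliced = regex.Match(str.Substring(index, length));

        Console.WriteLine(ranged.Success);       // True, as reported in the question
        Console.WriteLine(sliced.Success);       // True, "abc" starts the sliced string
        Console.WriteLine(ranged.Index);         // 4, relative to the original string
        Console.WriteLine(sliced.Index + index); // 4 after adding the offset back
    }
}
```

Both calls agree on success, and the only visible difference is that the overload reports the index relative to the original string.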
Thus, to know whether a match is at the real start of the string, you should add logic to your own code: for example, if index is greater than 0, no match can be at the real start of the string. However, the match.Index returned is correct with respect to the original string (4, not 0), so this looks like a bug to me.
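One way to add that logic is sketched below, as an assumption rather than an official recipe: cut the input off at the end of the range and then use the two-argument overload, so that ^ still refers to the real start of the string. The names RegexRange and MatchInRange are hypothetical and not part of .NET. Note the trade-off: $ and lookaheads will see the truncated end of the string, and the Substring call copies the input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexRange
{
    // Hypothetical helper: searches [start, start + length) while keeping ^ and \A
    // tied to index 0 of the original input, like regex.Match(input, start).
    public static Match MatchInRange(Regex regex, string input, int start, int length)
    {
        // Truncate only the tail, so positions before 'start' stay visible to the engine.
        string head = input.Substring(0, start + length);
        return regex.Match(head, start);
    }

    public static void Main()
    {
        string str = "abc abc";
        int index = 4;

        Match anchored = MatchInRange(new Regex("^abc"), str, index, str.Length - index);
        Console.WriteLine(anchored.Success); // False, same as regex.Match(str, index)

        Match plain = MatchInRange(new Regex("abc"), str, index, str.Length - index);
        Console.WriteLine(plain.Success);    // True
        Console.WriteLine(plain.Index);      // 4
    }
}
```

With this helper the anchored pattern gives the same answer for a range starting at index 4 as regex.Match(str, index) does in the question, while unanchored patterns still find the abc at index 4.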
Upvotes: 2