Reputation: 1059
I need to identify substrings found in a string such as:
"CityABCProcess Test" or "cityABCProcess Test"
to yield:
[ "City/city", "ABC", "Process", "Test" ]
The regular expression we have been using is:
"[A-Z][a-z]+|([A-Z]|[0-9])+\b|[A-Z]+(?=[A-Z])|([a-z]|[0-9])+"
This has been working well, but it breaks for a string such as:
"X-999"
We are implementing it in this fashion:
StringBuilder builder = new StringBuilder();
builder.Append("[A-Z][a-z]+|([A-Z]|[0-9])+\b|[A-Z]+(?=[A-Z])|([a-z]|[0-9])+");
foreach (Match match in Regex.Matches(name, builder.ToString()))
{
    // do things with each match
}
The problem is that it matches only the '999' and not the 'X'. Any ideas? I tested it on regexr.com, which reports that this regex should match both substrings.
Upvotes: 3
Views: 187
Reputation: 20873
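\b is being interpreted as an escape sequence (\u0008, backspace) in the C# string literal, so the regex engine receives a literal backspace character instead of a word-boundary anchor.
Escape the backslash (i.e., \\b), or use a verbatim string with the @ prefix: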
builder.Append(@"[A-Z][a-z]+|([A-Z]|[0-9])+\b|[A-Z]+(?=[A-Z])|([a-z]|[0-9])+");
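To see the difference side by side, here is a minimal sketch that runs the same pattern against "X-999" as a regular literal and as a verbatim literal. The names WordBoundaryDemo, Split, Regular, and Verbatim are made up for this illustration and are not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class WordBoundaryDemo
{
    // Regular literal: the C# compiler converts \b to U+0008 (backspace)
    // before the regex engine ever sees the pattern.
    public const string Regular =
        "[A-Z][a-z]+|([A-Z]|[0-9])+\b|[A-Z]+(?=[A-Z])|([a-z]|[0-9])+";

    // Verbatim literal: the backslash is preserved, so the regex engine
    // receives \b and treats it as a word-boundary anchor.
    public const string Verbatim =
        @"[A-Z][a-z]+|([A-Z]|[0-9])+\b|[A-Z]+(?=[A-Z])|([a-z]|[0-9])+";

    // Hypothetical helper: returns the text of every match, in order.
    public static string[] Split(string input, string pattern) =>
        Regex.Matches(input, pattern).Cast<Match>().Select(m => m.Value).ToArray();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Split("X-999", Regular)));  // prints "999"
        Console.WriteLine(string.Join(", ", Split("X-999", Verbatim))); // prints "X, 999"
    }
}
```

This also explains why regexr.com disagreed with the C# result: a pattern typed into an online tester is not processed as a C# string literal first, so \b reaches the engine unchanged.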
Upvotes: 4