Reputation: 29464
I'm seeing strange behaviour with the regex \bC#?\b:
string s1 = Regex.Replace("Bla Ca bla", @"\bCa?\b", "[$0]"); // Bla [Ca] bla (as expected)
string s2 = Regex.Replace("Bla C# bla", @"\bC#?\b", "[$0]"); // Bla [C]# bla (why???)
Does anyone understand why this happens, and how to match an optional # at the end?
Upvotes: 3
Views: 1021
Reputation: 56688
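Because \b marks a word boundary, and in regexes a word is a sequence of word characters (letters, digits and the underscore; see here); other characters are not included. In the first example a is a letter, so Ca is a word. In the second, # is not a word character, so the word consists of only C. The trailing \b cannot match between # and the following space, because neither of them is a word character, so #? ends up matching nothing and \b matches right after C.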
To see the difference, try removing \b:
string s2 = Regex.Replace("Bla C# bla", @"C#?", "[$0]"); // Bla [C#] bla
If you still need a \b-like boundary, check out this thread with some suggestions.
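One possible way to keep a boundary check after the optional # is to replace the trailing \b with a negative lookahead, (?!\w), which only requires that no word character follows. This is a minimal sketch rather than the approach from the linked thread; the class name OptionalHashDemo and the helper Bracket are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OptionalHashDemo
{
    // Hypothetical helper: wraps every match of the pattern in square brackets.
    public static string Bracket(string input, string pattern)
    {
        return Regex.Replace(input, pattern, "[$0]");
    }

    public static void Main()
    {
        // \b needs a word character on exactly one side. Between '#' and ' '
        // both sides are non-word characters, so the trailing \b fails there
        // and the engine backtracks to match only "C".
        Console.WriteLine(Bracket("Bla C# bla", @"\bC#?\b"));      // Bla [C]# bla

        // (?!\w) only asserts that the next character is not a word character.
        Console.WriteLine(Bracket("Bla C# bla", @"\bC#?(?!\w)"));  // Bla [C#] bla
        Console.WriteLine(Bracket("Bla C bla", @"\bC#?(?!\w)"));   // Bla [C] bla
        Console.WriteLine(Bracket("Bla Ca bla", @"\bC#?(?!\w)"));  // Bla Ca bla
    }
}
```

Note that with this pattern an input such as "C#x" would still match the bare C, because # itself is not a word character; whether that is acceptable depends on the data.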
Upvotes: 5