Aximili

Reputation: 29464

Regex not matching hash character between word boundary

I'm seeing strange behaviour with the regex \bC#?\b:

string s1 = Regex.Replace("Bla Ca bla", @"\bCa?\b", "[$0]"); // Bla [Ca] bla (as expected)
string s2 = Regex.Replace("Bla C# bla", @"\bC#?\b", "[$0]"); // Bla [C]# bla (why???)

Does anyone understand why it happens and how to match an optional # at the end?

Upvotes: 3

Views: 1021

Answers (1)

Andrei

Reputation: 56688

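Because \b marks a word boundary, and in regexes a word is a sequence of word characters (letters, digits and underscore; see here); other characters are not included. In the first example a is a letter, so Ca is a word and \b matches right after it. In the second, # is not a word character, and neither is the space that follows it, so there is no word boundary after #. The engine therefore leaves the optional # unmatched and matches only C, where \b does hold (between C and #).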

To see the difference, try removing \b:

string s2 = Regex.Replace("Bla C# bla", @"C#?", "[$0]"); // Bla [C#] bla

If you still need a \b-like boundary, check out this thread with some suggestions.
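One common way to get a boundary that still works around # is to replace \b with lookarounds. This is only a sketch of that idea, not necessarily what the linked thread recommends; the class name BoundaryDemo and the helper Bracket are made up for illustration. (?<!\w) requires that no word character precedes the match, and (?!\w) requires that none follows it, so the optional # can be included:

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Hypothetical helper: wraps each standalone "C" or "C#" in brackets.
    public static string Bracket(string input)
    {
        // (?<!\w) : no word character immediately before "C"
        // C#?     : "C" with an optional "#"
        // (?!\w)  : no word character immediately after the match
        return Regex.Replace(input, @"(?<!\w)C#?(?!\w)", "[$0]");
    }

    public static void Main()
    {
        Console.WriteLine(Bracket("Bla C# bla")); // Bla [C#] bla
        Console.WriteLine(Bracket("Bla C bla"));  // Bla [C] bla
        Console.WriteLine(Bracket("Bla Ca bla")); // Bla Ca bla (no match: "a" follows "C")
    }
}
```

Unlike \b, the lookarounds do not care whether the last matched character is a word character, which is why the trailing # no longer gets dropped.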

Upvotes: 5
