SylvainB

Reputation: 4820

Custom word boundaries in regular expression

I am trying to match whole words using a regular expression, but the word boundary assertion (\b) does not treat enough characters as part of a word for my purposes, so I want to add more (in this case, the "+" character).

Here is what I used to have (it is C#, but the language is not very relevant):

string expression = Regex.Escape(word);
Regex regExp = new Regex(@"\b" + expression + @"\b", RegexOptions.IgnoreCase);

This regex did not match "C++". So I tried replacing \b with a lookaround on a character class containing \w and the + character:

string expression = Regex.Escape(word);
Regex regExp = new Regex(@"(?![\w\+])" + expression + @"(?![\w\+])", RegexOptions.IgnoreCase);

But now nothing gets matched. Is there something I am missing?

Upvotes: 6

Views: 3496

Answers (1)

fge

Reputation: 121830

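(Side note: there is no need to escape the + inside a character class.)

The problem is that your pattern starts with a negative lookahead, whereas it should start with a negative lookbehind. A lookahead placed before the word inspects the first character of the word itself, so it fails whenever the word begins with a word character or a +. Try: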

@"(?<![\w+])" + expression + @"(?![\w+])"
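
For completeness, here is a minimal sketch of how that pattern could be wrapped up and checked. The class name CustomBoundary, the helper WholeWord and the sample strings are hypothetical names chosen for illustration; only the pattern itself comes from the line above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CustomBoundary
{
    // Hypothetical helper (not from the question): builds a whole-word regex
    // in which '+' counts as a word character on both sides of the word.
    public static Regex WholeWord(string word)
    {
        string expression = Regex.Escape(word);
        // Lookbehind: no word character or '+' directly before the word.
        // Lookahead:  no word character or '+' directly after the word.
        return new Regex(@"(?<![\w+])" + expression + @"(?![\w+])", RegexOptions.IgnoreCase);
    }

    public static void Main()
    {
        Regex regExp = WholeWord("C++");
        Console.WriteLine(regExp.IsMatch("I like C++ and C#")); // True
        Console.WriteLine(regExp.IsMatch("I like C+++"));       // False: a '+' follows the word
        Console.WriteLine(regExp.IsMatch("xc++"));              // False: a word character precedes it
    }
}
```

Both lookarounds are zero-width, so the match contains only the word itself, and they also succeed at the very start or end of the input, where there is no neighbouring character to reject.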

Upvotes: 11
