ggutenberg

Reputation: 7360

Regular Expression Word Boundary and Special Characters

I have a regular expression that escapes all special characters in a search string. This works great, but I can't get it to work with word boundaries. For example, with the haystack

add +

or

add (+)

and the needle

+

the regular expression /\+/gi matches the "+". However the regular expression /\b\+/gi doesn't. Any ideas on how to make this work?

Using

add (plus)

as the haystack and /\bplus/gi as the regex, it matches fine. I just can't figure out why the escaped characters are having problems.

Upvotes: 5

Views: 17334

Answers (3)

tchrist

Reputation: 80384

Boundaries are very conditional assertions; what they anchor depends on what they touch. See this answer for a detailed explanation, along with what else you can do to deal with it.

Upvotes: 0

Alan Moore

Reputation: 75222

\b is a zero-width assertion: it doesn't consume any characters, it just asserts that a certain condition holds at a given position. A word boundary asserts that the position is either preceded by a word character and not followed by one, or followed by a word character and not preceded by one. (A "word character" is a letter, a digit, or an underscore.) In your string:

add +

...there's a word boundary at the beginning because the a is not preceded by a word character, and there's one after the second d because it's not followed by a word character. The \b in your regex (/\b\+/) is trying to match between the space and the +, which doesn't work because neither of those is a word character.
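To see this in action, here is a small sketch in JavaScript (the question's regex literals look like JavaScript, but that is an assumption). The helper name buildPattern is hypothetical and not part of the question. It escapes the needle and adds \b only on a side where the needle itself starts or ends with a word character, which is one possible workaround rather than the only one.

```javascript
// \b needs a word character on exactly one side of the position.
console.log(/\+/gi.test("add +"));          // true
console.log(/\b\+/gi.test("add +"));        // false: space and + are both non-word
console.log(/\bplus/gi.test("add (plus)")); // true: ( is non-word, p is a word character

// Hypothetical helper: escape the needle, then anchor with \b only
// where the needle's own edge is a word character.
function buildPattern(needle) {
  // Escape regex metacharacters so the needle is matched literally.
  const escaped = needle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const left = /^\w/.test(needle) ? "\\b" : "";
  const right = /\w$/.test(needle) ? "\\b" : "";
  return new RegExp(left + escaped + right, "gi");
}

console.log("add (+)".match(buildPattern("+")));       // ["+"]
console.log("add (plus)".match(buildPattern("plus"))); // ["plus"]
console.log("surplus".match(buildPattern("plus")));    // null
```

With this approach a needle like + is matched anywhere, while a needle like plus still has to stand on its own as a word.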

Upvotes: 7

riwalk

Reputation: 14223

Try changing it to:

/\b\s?\+/gi

Edit:

Extend this concept as far as you want. If you want the first + after any word boundary:

/\b[^+]*\+/gi
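A quick check of both ideas against the haystacks from the question, assuming JavaScript and writing the literal plus as \+ (an unescaped + directly after ? or * is not a literal plus). The variable names are just for illustration.

```javascript
// Optional whitespace between the word boundary and the plus.
const optionalSpace = /\b\s?\+/gi;
// Anything that is not a plus between the word boundary and the plus.
const firstPlusAfterBoundary = /\b[^+]*\+/gi;

console.log("add +".match(optionalSpace));            // [" +"]
console.log("add (+)".match(optionalSpace));          // null
console.log("add +".match(firstPlusAfterBoundary));   // ["add +"]
console.log("add (+)".match(firstPlusAfterBoundary)); // ["add (+"]
```

Note that the match includes everything from the boundary up to the +, so you would need a capture group around \+ if you only want the plus itself.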

Upvotes: 0
