Reputation: 15672
What is a non-word boundary (\B) in regex, compared to a word boundary?
Upvotes: 42
Views: 20291
Reputation: 2703
The basic purpose of the non-word boundary (\B) is to create a regex that says:

- if we are at the beginning/end of a word character (\w = [a-zA-Z0-9_]), make sure the previous/next character is also a word character, e.g. "a\B." ~ "a\w": it matches "ab", "a4", "a_", ... but not "a " or "a.";
- if we are at the beginning/end of a non-word character (\W = [^a-zA-Z0-9_]), make sure the previous/next character is also a non-word character, e.g. "-\B." ~ "-\W": it matches "-.", "- ", "--", ... but not "-a" or "-1".

The word boundary (\b) is similar, but instead of requiring the adjacent characters to be of the same class (word character/non-word character), it requires them to differ, hence the name word boundary.
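As a quick check of the equivalences above, here is a minimal JavaScript sketch. The sample strings are the ones listed in this answer; the helper name `matches` is made up for illustration.

```javascript
// Hypothetical helper: does the pattern match anywhere in the input?
const matches = (pattern, input) => pattern.test(input);

// a\B. behaves like a\w: the character after "a" must be a word character.
const sameClassWord = ["ab", "a4", "a_"].every((s) => matches(/a\B./, s));
const rejectsNonWord = ["a ", "a."].every((s) => !matches(/a\B./, s));

// -\B. behaves like -\W: the character after "-" must be a non-word character.
const sameClassNonWord = ["-.", "- ", "--"].every((s) => matches(/-\B./, s));
const rejectsWord = ["-a", "-1"].every((s) => !matches(/-\B./, s));

// \b flips the requirement: the two neighbours must differ in class.
const boundaryAfterA = ["a ", "a."].every((s) => matches(/a\b./, s));
const boundaryAfterDash = ["-a", "-1"].every((s) => matches(/-\b./, s));

// prints: true true true true true true
console.log(
  sameClassWord, rejectsNonWord,
  sameClassNonWord, rejectsWord,
  boundaryAfterA, boundaryAfterDash
);
```

Swapping \B for \b in either pattern turns every listed match into a non-match and vice versa, which is the "same class" versus "different class" distinction in practice.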
Upvotes: 4
Reputation: 838346
A word boundary (\b) is a zero-width match that can match:

- between a word character (\w) and a non-word character (\W), or
- between a word character and the start or end of the string.

In JavaScript, the definition of \w is [A-Za-z0-9_] and \W is anything else.
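One consequence of this ASCII-only definition can be shown with a small sketch: in JavaScript, an accented letter such as é counts as a non-word character, so \b can appear inside what a human reader would call a single word. The sample word "café" is my own choice for illustration, not something from the question.

```javascript
// "\u00e9" is the letter é, written as an escape so the file encoding does not matter.
const word = "caf\u00e9";

// é is not in [A-Za-z0-9_], so \w rejects it and \W accepts it.
const eIsWordChar = /^\w$/.test("\u00e9");
const eIsNonWordChar = /^\W$/.test("\u00e9");

// There is a word boundary between "f" and "é" ...
const boundaryInsideWord = /caf\b/.test(word);

// ... but none after "é": a non-word character next to the end of the
// string is a \B position, not a \b position.
const boundaryAtEnd = /\u00e9\b/.test(word);
const nonBoundaryAtEnd = /\u00e9\B$/.test(word);

// prints: false true true false true
console.log(eIsWordChar, eIsNonWordChar, boundaryInsideWord, boundaryAtEnd, nonBoundaryAtEnd);
```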
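The negated version of \b, written \B, is a zero-width match where the above does not hold. Therefore it can match:

- between two word characters,
- between two non-word characters,
- between a non-word character and the start or end of the string, or
- in the empty string.
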
For example, if the string is "Hello, world!" then \b matches in the following places:

     H e l l o ,   w o r l d !
    ^         ^   ^         ^

And \B matches those places where \b doesn't match:

     H e l l o ,   w o r l d !
      ^ ^ ^ ^   ^   ^ ^ ^ ^   ^
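
The two diagrams can be reproduced with a short JavaScript sketch that lists every index at which a zero-width assertion holds. The function name `assertionPositions` is hypothetical, and using the sticky (`y`) flag is just one way to test a single position at a time.

```javascript
// Return every index 0..input.length at which the zero-width assertion matches.
// `assertion` is a regex source string such as "\\b" or "\\B".
function assertionPositions(assertion, input) {
  const re = new RegExp(assertion, "y"); // sticky: only try to match at lastIndex
  const positions = [];
  for (let i = 0; i <= input.length; i++) {
    re.lastIndex = i; // reset on every pass, since test() updates it
    if (re.test(input)) positions.push(i);
  }
  return positions;
}

const text = "Hello, world!";
const boundaries = assertionPositions("\\b", text);
const nonBoundaries = assertionPositions("\\B", text);

console.log(boundaries.join(", "));    // 0, 5, 7, 12
console.log(nonBoundaries.join(", ")); // 1, 2, 3, 4, 6, 8, 9, 10, 11, 13
```

Index 0 is the position before "H" and index 13 is the position after "!". Every one of the 14 positions appears in exactly one of the two lists, since \B is defined as the negation of \b.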
Upvotes: 107