Reputation: 14534
I am confused about /\w\b\w/. I think it should match "e w" in "we we", since:
\w is a word character, which is "e"
\b is a word boundary, which is " " (space)
\w is another word character, which is "w"
So the match is "e w" in "we we". But...
/\w\b\w/ will never match anything, because a word character can never be followed by both a non-word and a word character.
I got this one from MDN:
I can't understand their explanation. Can you help explain it in baby steps? Thank you!
Upvotes: 2
Views: 1282
Reputation: 1
I had the same question. Reading this post, I finally figured it out. The difficulty may be that we imagine the \b in \w\b\w as standing for a space. But here, as everywhere, \b only asserts that the character just before or just after the position is not a word character; it does not represent that character. Given that assertion, in \w\b\w the last \w says "No! There is a word character here." So the last \w contradicts \b. Keep in mind that \b is a pointer to a position, not a character class. As an exercise, prove that all of this also holds for the first \w in \w\b\w :)
Upvotes: 0
Reputation: 1
Use \w\s\w to match what you need. Note that \s and \d are different.
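A quick check of this suggestion, as a sketch assuming a JavaScript engine such as Node.js (the variable name is only illustrative): unlike \b, \s consumes the space as a real character, so the match is three characters long.

```javascript
// \s matches the space itself, so the pattern consumes three characters.
const spaced = "we we".match(/\w\s\w/);
console.log(spaced[0]);    // "e w"
console.log(spaced.index); // 1
```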
Upvotes: -1
Reputation: 404
The key is the meaning of \b. \b matches a word boundary: a position where a word character is not followed, or not preceded, by another word character. Note that a matched word boundary is not included in the match. In other words, the length of a matched word boundary is zero.
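One way to see the zero-length behaviour, sketched here with illustrative variable names, is to mark each boundary in "we we" and to measure what \b actually matched:

```javascript
// Insert a visible marker at every word boundary in the string.
const marked = "we we".replace(/\b/g, "|");
console.log(marked); // "|we| |we|"

// Every boundary match is the empty string, i.e. zero characters long.
const boundaryLengths = "we we".match(/\b/g).map((m) => m.length);
console.log(boundaryLengths); // [ 0, 0, 0, 0 ]
```

The markers land between characters (and at the string edges); the space itself is never part of a boundary.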
So \b itself doesn't match any characters; it's just a condition, like ^, $ and so on. Just as /^\w/ means "starts with a word character", /\w\b/ means "a word character not followed by another word character".
In "e w", /\w\b/ matches only "e" (a word character that is not followed by a word character, because the next character is a space), not "e ".
/\w\W/ does match "e " in "e w". \b is just a condition; it doesn't match anything.
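The difference between the two patterns can be checked directly. This is a small sketch; `sample` is just an illustrative name, and JSON.stringify is used only to make the trailing space visible:

```javascript
const sample = "e w";

// \b only asserts a position, so nothing after "e" is consumed.
const withBoundary = sample.match(/\w\b/)[0];
console.log(JSON.stringify(withBoundary)); // "e"

// \W consumes the space, so it becomes part of the match.
const withNonWord = sample.match(/\w\W/)[0];
console.log(JSON.stringify(withNonWord)); // "e "
```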
/\w\b\w/ means a word character that is followed by both a non-word character and a word character, which is contradictory, so it will never match anything.
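A handful of samples cannot prove that a match is impossible, but they illustrate the contradiction. The inputs below are arbitrary choices for this sketch:

```javascript
const pattern = /\w\b\w/;

// Adjacent word characters, separated words, underscores, punctuation:
// none of them can satisfy \w, then \b, then \w.
const inputs = ["we we", "ab", "a b", "a_1", "x-y"];
const results = inputs.map((s) => pattern.test(s));
console.log(results); // [ false, false, false, false, false ]
```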
Upvotes: 2
Reputation: 70732
Your regular expression would fail for the input "we we" because a word boundary in most dialects is a position between \w and a non-word character (\W), or at the beginning or end of a string if it begins or ends with a word character.
Your regular expression is doing this:
\w word characters (a-z, A-Z, 0-9, _)
\b the boundary between a word char (\w) and not a word char
\w word characters (a-z, A-Z, 0-9, _)
Therefore, it's saying: look for a word character immediately following the position of your word boundary. If you were to remove the trailing \w, it would match the e in your input.
console.log("we we".match(/\w\b/));
// => [ 'e', index: 1, input: 'we we' ]
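As a follow-up sketch (assuming an engine that supports String.prototype.matchAll, such as a recent Node.js), adding the g flag lists every match. The second "e" also matches, because the end of the string counts as a boundary when the string ends with a word character:

```javascript
// Collect each match of /\w\b/ together with the index where it starts.
const hits = [..."we we".matchAll(/\w\b/g)].map((m) => [m[0], m.index]);
console.log(hits); // [ [ 'e', 1 ], [ 'e', 4 ] ]
```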
Upvotes: 1
Reputation: 13906
\w\b\w means match:
an alphanumeric character (\w); followed by
a word boundary (\b); followed by
another alphanumeric character (\w).
The key point is that \b doesn't consume any characters; it checks which characters are adjacent to the tested position. So \w\b\w matches only two characters, both of which must be alphanumeric (\w), and the imaginary point between them must have an alphanumeric on one side and a non-alphanumeric on the other, which is therefore impossible to match.
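The "checks adjacent characters" idea can be spelled out with lookarounds. The sketch below assumes an engine with lookbehind support (ES2018 or later); the lookaround form is an illustrative rewrite of \b for this example, not how any engine necessarily implements it:

```javascript
// Spelled-out form of \b: a word character on exactly one side of the position.
const boundary = /(?<=\w)(?!\w)|(?<!\w)(?=\w)/g;
const viaLookaround = "we we".replace(boundary, "|");
const viaBuiltin = "we we".replace(/\b/g, "|");
console.log(viaLookaround === viaBuiltin); // true

// Between two consumed \w characters both sides are word characters,
// so neither alternative of the boundary test can hold.
const expanded = /\w(?:(?<=\w)(?!\w)|(?<!\w)(?=\w))\w/;
console.log(expanded.test("we we")); // false
```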
Hope this helps.
Upvotes: 2
Reputation: 76736
The space character isn't the word boundary. A word boundary isn't a character itself, it's the place "in between characters" where a word character transitions to a non-word character.
So "e w".match(/\w\b/) only matches "e", not "e ".
/\w\b\w/ never matches anything because it would require that a word character be immediately followed by a non-word character and also by a word character, which is of course not possible.
Upvotes: 5