jpsimons

Reputation: 28100

JavaScript regex engine: Word boundaries not matching at start of string for non-word characters

I thought \b matches at the transition between word and non-word characters, or at the start or end of the string. So this should produce a match:

'#abc'.match(/\b#/)

But it's null, at least in Firefox and Chrome. Any idea why?

Upvotes: 4

Views: 510

Answers (2)

NicolasB

Reputation: 1071

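\b matches at the positions described by (^\w|\w$|\W\w|\w\W): before a leading word character, after a trailing one, or between a word character and a non-word character. Unlike that pattern, \b is zero-width and consumes no characters. You've probably read the following in the Mozilla documentation: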

A word boundary matches the position between a word character followed by a non-word character, or between a non-word character followed by a word character, or the beginning of the string, or the end of the string.

That description is imprecise. It should specify that \b matches at the beginning or end of the string only when the adjacent character is a word character. That's the problem with writing long sentences instead of bullet points to explain something fairly algorithmic: it's hard to read, and therefore hard to proofread.


Example of a correct definition from a source other than Mozilla:

There are three different positions that qualify as word boundaries:

  • Before the first character in the string, if the first character is a word character.
  • After the last character in the string, if the last character is a word character.
  • Between two characters in the string, where one is a word character and the other is not a word character.
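To see those three rules in action, here is a small sketch that lists every index where \b matches. The helper name boundaryPositions and the lookaround pattern are my own for illustration, not part of any API; the lookaround form is just one way to spell out the rules explicitly, and it assumes an engine with lookbehind support.

```javascript
// Hypothetical helper: collect the index of every zero-width match.
// The pattern passed in must carry the g flag, as matchAll requires.
function boundaryPositions(text, pattern) {
  const positions = [];
  for (const match of text.matchAll(pattern)) {
    positions.push(match.index);
  }
  return positions;
}

// Built-in word boundary.
const builtIn = /\b/g;

// The same rules written out with lookarounds (an assumption-labelled
// illustration): a word character on exactly one side of the position.
const spelledOut = /(?<=\w)(?!\w)|(?<!\w)(?=\w)/g;

console.log(boundaryPositions('#abc', builtIn));    // [ 1, 4 ]
console.log(boundaryPositions('#abc', spelledOut)); // [ 1, 4 ]
console.log(boundaryPositions('abc', builtIn));     // [ 0, 3 ]
console.log(boundaryPositions('#', builtIn));       // []
```

Note that index 0 never shows up for '#abc': the string starts with a non-word character, so there is no boundary there.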

Upvotes: 5

Poul Bak

Reputation: 10930

'#' is not a word character, so there's no word boundary to match at the start of the string. Simple as that.

If you delete the '#', so the string is just 'abc', then \b will correctly match the word boundary at the start.
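A quick way to check this in a console, as a minimal sketch (the variable names are only for illustration):

```javascript
// No boundary before '#', because '#' is not a word character.
const withHash = '#abc'.match(/\b#/);

// There is a boundary right after '#', between '#' and 'a'.
const afterHash = '#abc'.match(/#\b/);

// Without the '#', the string starts with a word character,
// so \b matches at index 0.
const plain = 'abc'.match(/\babc/);

console.log(withHash);         // null
console.log(afterHash.index);  // 0
console.log(plain.index);      // 0
```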

Upvotes: 0
