RegEx consecutive matches

Question

I have this regex in JavaScript to remove words of three letters or fewer:

srcText = srcText.replace(/\s[a-z]{1,3}\s/gi,'');

It works, but when two consecutive matches are found, the second one isn't affected:

Ex.:

"... this is one sample of a text ... "

' one ' and ' a ' won't be affected unless I run the code one more time:

srcText = srcText.replace(/\s[a-z]{1,3}\s/gi,'');

So I'd have to run the code n times, n being the number of consecutive matches in srcText.

For testing purposes:

http://regexpal.com/

sample text:

http://www.gutenberg.org/files/521/521-0.txt (say, 4th paragraph)

Is my regex missing something, or does JavaScript not allow this kind of recursiveness?

Dave · Accepted Answer

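The second match is skipped because each match consumes the whitespace on both sides, so the space that follows one short word is no longer available to begin the next match. JavaScript's regular expressions (and most others too) support the \b escape sequence, which matches (zero-width) word boundaries. In your expression, replace the two \s with \b and consecutive short words will all match.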
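As a minimal sketch of the difference, the snippet below runs both patterns on the sample sentence from the question. The variable names (firstPass, withBoundaries, cleaned) and the final whitespace cleanup are assumptions added for illustration; they are not part of the original answer.

```javascript
// Sample sentence from the question.
const srcText = "this is one sample of a text";

// Original pattern: every match consumes the whitespace on both sides,
// so the space after "is" cannot also start a match on "one".
const firstPass = srcText.replace(/\s[a-z]{1,3}\s/gi, '');
console.log(firstPass); // "thisone samplea text"

// Pattern with \b: the boundaries are zero-width, so nothing is consumed
// between adjacent short words and all of them match in one pass.
const withBoundaries = srcText.replace(/\b[a-z]{1,3}\b/gi, '');

// Assumed extra step: \b leaves the surrounding spaces in place,
// so collapse the runs of whitespace that remain.
const cleaned = withBoundaries.replace(/\s+/g, ' ').trim();
console.log(cleaned); // "this sample text"
```

Note that \b treats an apostrophe as a non-word character, so with this pattern a word such as "don't" would lose both "don" and "t". Whether that matters depends on the text being processed.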

As noted by Sam in the comments, a word boundary is identified as:

(^\w|\w\W|\W\w|\w$)

that is, a non-word character followed by a word character, or a word character followed by a non-word character, where the start and end of the string are treated as non-word characters. Note, however, that \b is zero-width, so it is not simply shorthand for that expression.
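To illustrate what zero-width means in practice, the sketch below lists what each pattern actually matches in the sample sentence. The lookahead variant at the end is an alternative that is not in the answer: it is one possible way to keep whitespace as the delimiter while still leaving the trailing space unconsumed. All variable names are hypothetical.

```javascript
const srcText = "this is one sample of a text";

// Consuming pattern: the matched text includes the surrounding spaces,
// which is why "one" and "a" are never found.
const consuming = srcText.match(/\s[a-z]{1,3}\s/gi);
console.log(consuming); // [ " is ", " of " ]

// Zero-width boundaries: the matched text is only the word itself.
const boundaries = srcText.match(/\b[a-z]{1,3}\b/gi);
console.log(boundaries); // [ "is", "one", "of", "a" ]

// Assumed alternative: consume the leading space, but only assert the
// trailing one with a lookahead so it can start the next match.
const lookahead = srcText.replace(/\s[a-z]{1,3}(?=\s)/gi, '');
console.log(lookahead); // "this sample text"
```

Like the original pattern, the lookahead variant does not match a short word at the very start or end of the string, because it requires whitespace on both sides.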

