Reputation: 55273
I want the regex to exclude some words. Something like this:
(?!\bhe\b|\bit\b)\w+
However, it's only excluding the first letters of these words, in this case h and i.
Why is this, and how do I fix it?
Upvotes: 0
Views: 75
Reputation: 163352
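The negative lookahead is not anchored, so the regex engine tests the assertion at every position, starting at the position before the h of he. There the assertion fails, because he is directly to the right, so no match can start at that position. The engine then moves on and tests the assertion again at the position after the h and before the e. Now the assertion succeeds, as there is no he directly to the right at that position (the \b inside the lookahead cannot match between two word characters), and \w+ matches one or more word characters, here the e.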
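To see this behaviour concretely, here is a minimal sketch in Python (the question does not name a regex flavour, so the use of Python's re module is an assumption, and the sample text is made up for illustration):

```python
import re

# Pattern from the question: the lookahead is tried at every position,
# including positions in the middle of a word.
unanchored = re.compile(r"(?!\bhe\b|\bit\b)\w+")

question_sample = "he said it"  # hypothetical sample text

# At the start of "he" the assertion fails, but one position later it
# succeeds, so the rest of the word ("e") is matched. Same for "it".
leftovers = unanchored.findall(question_sample)
print(leftovers)  # ['e', 'said', 't']
```

Only the first letter of each excluded word is skipped, which matches what the question describes.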
Placing the \b before the lookahead makes sure the assertion is only tested at a word boundary. This way the assertion will not run between h and e, because there is no word boundary at that position, so a match can no longer start in the middle of a word.
\b(?!he\b|it\b)\w+
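A quick check of the anchored pattern, again sketched in Python as an assumed flavour with made-up sample text. It also shows that words which merely start with he or it are still matched, because the \b inside the lookahead requires the excluded word to end there:

```python
import re

# Pattern from the answer: \b forces every match to start at a word boundary,
# so the lookahead is never evaluated in the middle of a word.
anchored = re.compile(r"\b(?!he\b|it\b)\w+")

answer_sample = "he said hello it items"  # hypothetical sample text

kept = anchored.findall(answer_sample)
print(kept)  # ['said', 'hello', 'items']
```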
Upvotes: 3