Reputation: 1949
First I have a regex cat(?=mouse\b). It matches cat in "a catmouse x". But I want a version where there is a word boundary between cat and mouse. So I tried these regexes:
cat(?=\bmouse\b)
cat\b(?=mouse\b)
But neither of them matches cat in "a cat mouse x". How do I accomplish this?
Upvotes: 0
Views: 1636
Reputation: 163517
You could add an optional space in the lookahead, as there is no word boundary between cat and mouse in catmouse:
cat(?= ?mouse\b)
To match either a dot, comma or a space between the words, and not match catmouse, you could use a character class:
cat(?=[\s.,]mouse\b)
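To see how the two patterns above differ, here is a minimal sketch using Python's re module. The sample strings other than the two from the question are made up for illustration.

```python
import re

# The two patterns from this answer.
optional_space = re.compile(r"cat(?= ?mouse\b)")
separator_required = re.compile(r"cat(?=[\s.,]mouse\b)")

# Hypothetical sample inputs; the first two come from the question.
samples = ["a catmouse x", "a cat mouse x", "a cat,mouse x", "a cat mousetrap x"]

# For each sample: (matches with optional space, matches with required separator)
results = {
    s: (bool(optional_space.search(s)), bool(separator_required.search(s)))
    for s in samples
}

for s, (opt, sep) in results.items():
    print(f"{s!r}: optional space={opt}, separator required={sep}")
```

The optional-space version still accepts catmouse, while the character class version only accepts cat when a whitespace character, dot or comma sits between the two words.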
Explanation
cat - Match literally
(?= - Positive lookahead, assert what is directly to the right is
[\s.,] - Match either a whitespace char, dot or comma
mouse\b - Match mouse and a word boundary
) - Close lookahead
If you don't want cat to be part of a larger word, you might prepend a word boundary: \bcat
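As a rough illustration of what the leading \b changes, the sketch below uses Python's re module. The input "a bobcat mouse x" is a made-up example, not from the question.

```python
import re

without_boundary = re.compile(r"cat(?=[\s.,]mouse\b)")
with_boundary = re.compile(r"\bcat(?=[\s.,]mouse\b)")

# Hypothetical input where "cat" is the tail of a longer word.
text = "a bobcat mouse x"

inside_word_found = bool(without_boundary.search(text))  # "cat" inside "bobcat" is found
inside_word_rejected = with_boundary.search(text) is None  # \b rejects it

# On the question's input the lookahead asserts but does not consume,
# so only "cat" is part of the match.
m = with_boundary.search("a cat mouse x")
matched_text = m.group() if m else None

print(inside_word_found, inside_word_rejected, matched_text)
```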
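Per the linked www.regular-expressions.info documentation, there are three different positions that qualify as word boundaries:
1. Before the first character in the string, if the first character is a word character.
2. After the last character in the string, if the last character is a word character.
3. Between two characters in the string, where one is a word character and the other is not.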
Upvotes: 1
Reputation: 248
Oh you are so close :)
I think you didn't fully understand the word boundary \b.
Placed before a string, it makes sure a word starts with the characters that come after it. \bmouse will match every word starting with mouse.
Regex: '/\bmouse/i'
Matches: Mouse, MouseMouse, MouseCat, Mouse...
Fails: CatMouse, MyMouse, EtcMouse
If \b is put after a string, it makes sure the word does not continue.
Regex: '/mouse\b/i'
Matches: Mouse, MouseMouse, CatMouse, ...Mouse
Fails: MouseCat, MouseHouse, MouseEtc
Putting both together makes sure you have an enclosed word:
Regex: '/\bmouse\b/i'
Matches: Mouse
Fails: NoMouse, MouseNo, NoMouseNo
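The three cases above can be checked with a short Python sketch. re.IGNORECASE is an assumption added here because the sample words are capitalized while the pattern is lowercase; the word list is a subset of the examples above.

```python
import re

words = ["Mouse", "MouseCat", "CatMouse", "NoMouseNo"]

patterns = {
    "leading": r"\bmouse",    # word must start with "mouse"
    "trailing": r"mouse\b",   # word must end with "mouse"
    "both": r"\bmouse\b",     # "mouse" must be a whole word
}

# For each pattern, collect the words in which it finds a match.
matches = {
    name: [w for w in words if re.search(p, w, re.IGNORECASE)]
    for name, p in patterns.items()
}

for name, hit in matches.items():
    print(name, hit)
```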
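The \b only asserts a position between a word character and a non-word character. It does not consume anything, so it cannot stand in for the space between two words. If you want to allow something extra there, you need to mention it explicitly. The regex you want is probably this: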
cat(?=[.,\ ]mouse\b)
Note: The first \b was replaced by the characters you want to allow between the two words.
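A quick way to compare the two attempts from the question with the suggested regex is the following Python sketch, run against the question's own input.

```python
import re

text = "in a cat mouse x"

attempts = [
    r"cat(?=\bmouse\b)",      # fails: "mouse" must follow "cat" directly
    r"cat\b(?=mouse\b)",      # fails for the same reason
    r"cat(?=[.,\ ]mouse\b)",  # works: the separator is matched explicitly
]

# True where the pattern finds "cat" in the text.
found = [re.search(p, text) is not None for p in attempts]
print(found)
```

In both failing attempts the \b itself succeeds right after cat, but the next character is a space, so the literal mouse cannot match at that position.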
Upvotes: 1