Dulguun Otgon

Reputation: 1949

Regex look ahead/behind with word boundary

I have a regex, cat(?=mouse\b), that matches cat in "a catmouse x". But I want a version that requires a word boundary between cat and mouse. So I tried these regexes:

But none of the above matches cat in "a cat mouse x". How do I accomplish this?

Upvotes: 0

Views: 1636

Answers (2)

The fourth bird

Reputation: 163517

You could add an optional space in the lookahead. There is no word boundary inside catmouse, because t and m are both word characters:

cat(?= ?mouse\b)
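A quick way to check this is a small sketch with Python's re module. Python is an assumption here, since the question does not name a regex engine, but lookaheads and \b behave the same way in this case. Because the space is optional, cat is matched in both strings:

```python
import re

# Lookahead with an optional space before "mouse".
pattern = re.compile(r"cat(?= ?mouse\b)")

with_space = pattern.search("a cat mouse x")
without_space = pattern.search("a catmouse x")

# Only "cat" is consumed; the text inside the lookahead is not part of the match.
print(with_space.group(), with_space.span())
print(without_space.group(), without_space.span())
```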


To require a dot, comma or whitespace character between the two words, and so not match catmouse, you could use a character class:

cat(?=[\s.,]mouse\b)

Explanation

  • cat Match literally
  • (?= Positive lookahead, assert what is directly to the right is
    • [\s.,] Match either a whitespace char, dot or comma
    • mouse\b Match mouse and a word boundary
  • ) Close lookahead
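The breakdown above can be tried out with a short sketch, again assuming Python's re module. The sample strings are made up for illustration:

```python
import re

# A separator (whitespace, dot or comma) is now required before "mouse".
pattern = re.compile(r"cat(?=[\s.,]mouse\b)")

samples = [
    "a cat mouse x",      # space separator
    "a cat,mouse x",      # comma separator
    "a cat.mouse x",      # dot separator
    "a catmouse x",       # no separator
    "a cat mousetrap x",  # "mouse" continues into a longer word
]

results = {text: bool(pattern.search(text)) for text in samples}
for text, matched in results.items():
    print(f"{text!r}: {'match' if matched else 'no match'}")
```

The first three samples match. The last two do not: one has no separator, and in the other the trailing \b fails because mouse is followed by another word character.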


If you don't want cat to be part of a larger word, you might prepend a word boundary: \bcat
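As a rough illustration of the difference, assuming Python's re module and a made-up input containing bobcat:

```python
import re

unanchored = re.compile(r"cat(?=[\s.,]mouse\b)")
anchored = re.compile(r"\bcat(?=[\s.,]mouse\b)")

text = "a bobcat mouse x"

# Without the leading \b, the "cat" inside "bobcat" is found.
found_inside_word = bool(unanchored.search(text))
# With the leading \b it is rejected, because "b" directly precedes "cat".
found_as_whole_word = bool(anchored.search(text))

print(found_inside_word, found_as_whole_word)
```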


Per the documentation at www.regular-expressions.info, there are three different positions that qualify as word boundaries:

  • Before the first character in the string, if the first character is a word character.
  • After the last character in the string, if the last character is a word character.
  • Between two characters in the string, where one is a word character and the other is not a word character.
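These three rules can be made visible by listing every index where \b matches. This is a minimal sketch assuming Python's re module. The helper boundary_positions and the pattern named attempt are hypothetical, written only for illustration and not taken from the question:

```python
import re

def boundary_positions(text):
    # Collect every index in text at which the zero-width \b assertion holds.
    return [m.start() for m in re.finditer(r"\b", text)]

spaced = boundary_positions("cat mouse")
joined = boundary_positions("catmouse")
print(spaced)
print(joined)

# Hypothetical attempt: a boundary directly between "cat" and the lookahead.
attempt = re.compile(r"cat\b(?=mouse\b)")
print(attempt.search("a cat mouse x"))
print(attempt.search("a catmouse x"))
```

In "cat mouse" there is a boundary right after cat, but the next character is a space, so a lookahead that expects mouse immediately after the boundary cannot succeed. In "catmouse" there is no boundary between t and m at all. That is why a bare \b between the two words matches neither string.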

Upvotes: 1

Phy

Reputation: 248

Oh you are so close :)
I think you didn't fully understand the word boundary \b.

How \b works in regex

Placed before a pattern, \b makes sure the word starts with the characters that come after it. \bmouse will match every word starting with mouse. (The examples below assume case-insensitive matching.)

Regex: '/\bmouse/'
Matches: Mouse, MouseMouse, MouseCat, Mouse...
Fails: CatMouse, MyMouse, EtcMouse

If \b is put after a pattern, it makes sure the word does not continue past it.

Regex: '/mouse\b/'
Matches: Mouse, MouseMouse, CatMouse, ...Mouse
Fails: MouseCat, MouseHouse, MouseEtc

Putting both together makes sure you have an enclosed word:

Regex: '/\bmouse\b/'
Matches: Mouse
Fails: NoMouse, MouseNo, NoMouseNo
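The three cases can be checked with a small sketch. It assumes Python's re module and case-insensitive matching, since the example words are capitalized while the patterns are lowercase. The helper matching is hypothetical:

```python
import re

def matching(pattern, words):
    # Return the words in which the pattern is found, ignoring case.
    regex = re.compile(pattern, re.IGNORECASE)
    return [w for w in words if regex.search(w)]

words = ["Mouse", "MouseMouse", "MouseCat", "CatMouse", "NoMouseNo"]

starts = matching(r"\bmouse", words)   # word starts with "mouse"
ends = matching(r"mouse\b", words)     # word ends with "mouse"
whole = matching(r"\bmouse\b", words)  # the whole word is "mouse"

print(starts)
print(ends)
print(whole)
```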

Results

The \b only asserts a position between a word character and a non-word character; it does not match any characters itself. If you want specific characters between cat and mouse, you need to spell them out. The regex you want is probably this:

cat(?=[.,\ ]mouse\b)

Note: The first \b was replaced by the characters you wanted to filter.
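One detail worth checking is that the escaped space in this character class allows only a literal space, whereas a class using \s also allows tabs and other whitespace. A minimal comparison, assuming Python's re module and made-up sample strings:

```python
import re

space_only = re.compile(r"cat(?=[.,\ ]mouse\b)")      # dot, comma or a literal space
any_whitespace = re.compile(r"cat(?=[\s.,]mouse\b)")  # dot, comma or any whitespace

samples = ["a cat mouse x", "a cat\tmouse x", "a catmouse x"]

# For each sample: (matched by space_only, matched by any_whitespace)
comparison = {
    text: (bool(space_only.search(text)), bool(any_whitespace.search(text)))
    for text in samples
}
for text, flags in comparison.items():
    print(repr(text), flags)
```

Both patterns accept the space-separated string and reject catmouse. Only the \s version accepts the tab-separated one.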

Upvotes: 1
