Davy

Reputation: 113

RegEx - Match Words in Same Sentence with Negative Lookaround

I'm trying to match a word (good) if another word (bad) does not exist in the same sentence. I want to do this using lookaround as I want only the first word (good) to be included in the captured results.

Here's my regular expression:

(?<!\bbad\b[^.])\bgood\b(?![^.]+\bbad\b)

This works in all cases except when the other word (bad) precedes the word I'm looking for (good) in the same sentence.

So in the following examples, the results are as follows:

  1. TEST 1: A good example of a bad regex. (no matches - PASS)
  2. TEST 2: A bad example of a good regex. (match found - FAIL)
  3. TEST 3: A bad example. A good regex. (match found - PASS)
  4. TEST 4: A good example. A bad regex. (match found - PASS)
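
For anyone who wants to reproduce these four results outside regex101, here is a minimal Python sketch. It assumes the built-in re module, which accepts this pattern because the lookbehind has a fixed width (the word bad plus exactly one non-dot character). The names QUESTION_PATTERN, SENTENCES and question_matches are only illustrative.

```python
import re

# Pattern from the question. The lookbehind only inspects the four
# characters directly before "good": "bad" plus one non-dot character.
QUESTION_PATTERN = re.compile(r"(?<!\bbad\b[^.])\bgood\b(?![^.]+\bbad\b)")

# The four test sentences, without the "TEST n:" labels.
SENTENCES = [
    "A good example of a bad regex.",
    "A bad example of a good regex.",
    "A bad example. A good regex.",
    "A good example. A bad regex.",
]


def question_matches(text):
    """Return every occurrence of 'good' that the question's pattern accepts."""
    return QUESTION_PATTERN.findall(text)


if __name__ == "__main__":
    for text in SENTENCES:
        print(f"{text!r}: {question_matches(text)}")
```

The second sentence is the only one that gives an unwanted result: it returns ['good'] even though bad occurs earlier in the same sentence.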

Can someone please point me to what I'm missing here? Here's my test on regex101.com.

Upvotes: 3

Views: 315

Answers (2)

anubhava

Reputation: 785058

You may use this regex:

(?:^|\.)(?:(?!\b(?:bad|good)\b)[^.])*(\bgood\b)(?![^.]+\bbad\b)

RegEx Details:

  • (?:^|\.): Match start position or a dot
  • (?:(?!\b(?:bad|good)\b)[^.])*: Match a non-dot character, provided the word good or bad does not start at that position. Repeat this match 0 or more times
  • (\bgood\b): Match full word good
  • (?![^.]+\bbad\b): Negative lookahead to assert that the current position is not followed by one or more non-dot characters and then the word bad
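
As a rough check of the breakdown above, the pattern can be run against the four sentences from the question. This is a minimal sketch assuming Python's built-in re module; ANSWER_PATTERN and captured_good are illustrative names, not part of the answer itself.

```python
import re

# Pattern from this answer. The overall match also consumes the text
# leading up to "good", so the word itself is read from capture group 1.
ANSWER_PATTERN = re.compile(
    r"(?:^|\.)(?:(?!\b(?:bad|good)\b)[^.])*(\bgood\b)(?![^.]+\bbad\b)"
)


def captured_good(text):
    """Return the group-1 captures ('good') for every match in text."""
    return [m.group(1) for m in ANSWER_PATTERN.finditer(text)]


if __name__ == "__main__":
    for text in (
        "A good example of a bad regex.",
        "A bad example of a good regex.",
        "A bad example. A good regex.",
        "A good example. A bad regex.",
    ):
        print(f"{text!r}: {captured_good(text)}")
```

The first two sentences give an empty list and the last two give ['good']. Note that the full match (m.group(0)) includes the leading dot and any text before good, so use m.group(1) or m.span(1) if you only need the word.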

Upvotes: 4

The fourth bird

Reputation: 163277

If your regex engine supports a quantifier inside a lookbehind, you can repeat the character class zero or more times there, and also exclude newlines from it.

(?<!\bbad\b[^.\n]*)\bgood\b(?![^.\n]+\bbad\b)

The pattern matches:

  • (?<!\bbad\b[^.\n]*) Negative lookbehind: assert that what precedes is not the word bad followed by zero or more characters other than . or a newline
  • \bgood\b Match the word good
  • (?![^.\n]+\bbad\b) Negative lookahead: assert that what follows is not one or more characters other than . or a newline and then the word bad
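
Not every engine accepts a quantifier in a lookbehind. Python's built-in re module, for example, requires a fixed-width lookbehind and rejects the pattern above. For that case, here is a minimal sketch of a different technique that should give the same result: split the text into runs of characters other than . or newline, skip any run that contains bad, and report the positions of good in the rest. The function name good_without_bad is hypothetical.

```python
import re

SENTENCE = re.compile(r"[^.\n]+")   # a run of text without '.' or newline
BAD = re.compile(r"\bbad\b")
GOOD = re.compile(r"\bgood\b")


def good_without_bad(text):
    """Return (start, end) spans of 'good' in sentences that lack 'bad'.

    This is a sentence-by-sentence check, not the lookbehind pattern itself.
    """
    spans = []
    for sentence in SENTENCE.finditer(text):
        if BAD.search(sentence.group()):
            continue  # 'bad' is somewhere in this sentence, so skip it
        offset = sentence.start()
        for m in GOOD.finditer(sentence.group()):
            spans.append((offset + m.start(), offset + m.end()))
    return spans


if __name__ == "__main__":
    print(good_without_bad("A bad example. A good regex."))
```

Because the check covers the whole sentence, it does not matter whether bad comes before or after good. The returned spans index into the original text, so text[start:end] is always the word good.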

Upvotes: 2