Peter J

Reputation: 147

Match words and ignore numbers with Regex

I have a list of words, for example, "at", "in", "on". I need to match any of those exact words but I'm having some difficulty with one part of it.

Examples:

"I am at work" - should match with "at"

"I am attracting honey bees" - should not match

"I am at123" - should match

I currently have something like this, but it's not doing exactly what I need.

(?i)(\W|^)(at|in|on)(\W|$)

Any assistance is appreciated.

Upvotes: 1

Views: 537

Answers (2)

The fourth bird

Reputation: 163467

\W matches a non-word character, and digits are word characters, which is why "at123" does not match. It seems you want to allow either a non-word character or a digit on the left and right, without consuming it as part of the match.

If you want to match words delimited by word boundaries, but with digits also allowed as neighbors, you can use lookaround assertions with the negated character class [^\W\d], which matches any word character except a digit.

(?<![^\W\d])(?:at|[io]n)(?![^\W\d])

Explanation

  • (?<![^\W\d]) Negative lookbehind, assert that the character to the left is not a word character other than a digit
  • (?:at|[io]n) Match at, in, or on
  • (?![^\W\d]) Negative lookahead, assert that the character to the right is not a word character other than a digit
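
To check the pattern against the three examples from the question, here is a minimal Python sketch. The question does not say which regex engine is in use, so Python's re module is an assumption; the variable names are only for illustration.

```python
import re

# Pattern from this answer: neither neighbor may be a word character
# other than a digit (digits and non-word characters are allowed).
pattern = re.compile(r"(?<![^\W\d])(?:at|[io]n)(?![^\W\d])")

# The three example strings from the question.
samples = [
    "I am at work",
    "I am attracting honey bees",
    "I am at123",
]

for text in samples:
    match = pattern.search(text)
    # Print the matched word, or None when nothing matches.
    print(f"{text!r}: {match.group() if match else None}")
```

The first and third strings should report "at", and the second should report None. If case-insensitive matching is needed, as with the (?i) in the question, re.IGNORECASE can be passed to re.compile.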


If you want to allow a digit only on the right and not on the left, you can use a word boundary on the left side of the pattern instead:

\b(?:at|[io]n)(?![^\W\d])
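
The difference between the two variants only shows up when a digit sits directly to the left of the word. A small Python comparison, assuming the re module and using hypothetical variable names and test strings:

```python
import re

# Variant 1: digits allowed on both sides.
digits_both_sides = re.compile(r"(?<![^\W\d])(?:at|[io]n)(?![^\W\d])")
# Variant 2: word boundary on the left, digits allowed only on the right.
digits_right_only = re.compile(r"\b(?:at|[io]n)(?![^\W\d])")

for text in ["at123", "123at", "123at456"]:
    both = bool(digits_both_sides.search(text))
    right = bool(digits_right_only.search(text))
    print(f"{text!r}: both sides={both}, right only={right}")
```

Both variants should match "at123". Only the first should match "123at" and "123at456", because \b finds no boundary between a digit and a letter.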


Upvotes: 1

Tim Biegeleisen

Reputation: 522050

It appears that you want to match these words when surrounded by either whitespace, numbers, or the start/end of the string. In that case, we can try using the following pattern:

(?:(?<!\S)|(?<!\D))(?:at|in|on)(?:(?!\S)|(?!\D))

This pattern says to:

  • (?:
    • (?<!\S) lookbehind and assert whitespace or the start of the string precedes
    • | OR
    • (?<!\D) lookbehind and assert that a digit or the start of the string precedes
  • )
  • (?:at|in|on) match at, in, or on
  • (?:
    • (?!\S) lookahead and assert whitespace or the end of the string follows
    • | OR
    • (?!\D) lookahead and assert that a digit or the end of the string follows
  • )
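
As a quick check of this pattern, here is a short Python sketch. The re module is an assumption, since the question does not name an engine, and the last sample string is a hypothetical addition to show the effect of punctuation.

```python
import re

# Pattern from this answer: each side must be whitespace, a digit,
# or the start/end of the string.
pattern = re.compile(r"(?:(?<!\S)|(?<!\D))(?:at|in|on)(?:(?!\S)|(?!\D))")

samples = [
    "I am at work",
    "I am attracting honey bees",
    "I am at123",
    "I am at.",  # hypothetical extra case: punctuation after the word
]

for text in samples:
    match = pattern.search(text)
    print(f"{text!r}: {match.group() if match else None}")
```

The first and third strings should report "at", and the other two None. Note that punctuation next to the word blocks the match here, since a period is neither whitespace nor a digit.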


Upvotes: 1
