john_black

Reputation: 227

regular expression negative group

I'm trying to write a regular expression that matches and captures any word containing "ball" that does not have "foot" or "basket" in front of it. For example, I want to match "volleyball" and "dodgeball" but not "basketball" or "football". The important constraint is that I can't use a positive group, only a negative one. What I tried:

[^(?:foot|basket)(ball)]

!(?:foot|basket)(ball)

Finding the opposite is rather simple:

(?:foot|basket)(ball)

But that's not what I'm looking for. I need it the other way around.

EDIT: This is PHP; the pattern is used in a "preg_replace" call.

Upvotes: 0

Views: 156

Answers (2)

user557597

Reputation:

I would isolate every "ball" substring in the word, then enforce that neither "foot" nor "basket" comes directly before it.

\b(?:(?!ball)\w)*(?:(?<!foot)(?<!basket)ball(?:(?!ball)\w)*)+\b
Or, I think PCRE can do the assertion this way:
\b(?:(?!ball)\w)*(?:(?<!foot|basket)ball(?:(?!ball)\w)*)+\b

Formatted:

 \b 
 (?:
      (?! ball )
      \w 
 )*
 (?:
      (?<! foot )
      (?<! basket )
      ball
      (?:
           (?! ball )
           \w 
      )*
 )+
 \b 
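As a rough sketch of how the whole-word pattern above behaves, the following runs it through preg_match_all. The sample sentence and variable names are assumptions made for illustration; they do not come from the question.

```php
<?php
// Assumed sample input; "ballpark" is included to show a word that starts with "ball".
$text = 'football basketball volleyball dodgeball ballpark';

// The one-line pattern from above: a whole word in which every "ball"
// is not directly preceded by "foot" or "basket".
$pattern = '/\b(?:(?!ball)\w)*(?:(?<!foot)(?<!basket)ball(?:(?!ball)\w)*)+\b/';

// Collect every whole-word match; $matches[0] holds the full matches.
preg_match_all($pattern, $text, $matches);

print_r($matches[0]); // volleyball, dodgeball, ballpark
```

Because the pattern is anchored with \b on both sides, it matches the entire word rather than just the "ball" part.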

Upvotes: 0

Charles Duffy

Reputation: 295443

PHP uses PCRE, so negative lookbehind syntax is available:

(?<!foot|basket)ball
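A minimal sketch of using this lookbehind with preg_replace, as mentioned in the question. The input string and the replacement text 'BALL' are hypothetical, chosen only to make the effect visible.

```php
<?php
// Assumed sample input, not taken from the question.
$text = 'football basketball volleyball dodgeball';

// Negative lookbehind: match "ball" only when it is not
// directly preceded by "foot" or "basket".
$pattern = '/(?<!foot|basket)ball/';

// Hypothetical replacement: upper-case the matched "ball".
$result = preg_replace($pattern, 'BALL', $text);

echo $result, PHP_EOL; // football basketball volleyBALL dodgeBALL
```

Note that this pattern matches only the "ball" substring, not the whole word. PCRE accepts the two alternatives in one lookbehind because each alternative has a fixed length.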

Upvotes: 3
