Reputation: 11
I use capture groups to extract specific pieces of a string. This has worked before, but now that my pattern contains optional groups (using the ? quantifier) I am getting unexpected results.
I am trying to capture "Critical care medicine" as a group within a string, allowing the abbreviation CRIT and treating MEDICINE as optional. The group should not be captured if it is followed by "AND".
https://regex101.com/r/MeWB7J/1
REGEX: .*((?:(?:\bCRIT(?:ICAL)?\W*CARE\W*(?:MEDICINE)?))(?!\sAND)).*
If I pass it CRITICAL CARE MEDICINE, CRITICAL CARE, or CRIT CARE, it works fine and I get the expected result in my capture group. However, if I pass "CRITICAL CARE MEDICINE AND", my capture group is "CRITICAL CARE". If I pass "CRIT CARE AND", I get "CRIT CARE". I don't understand why the negative lookahead isn't working; that part of the pattern seems to be ignored entirely.
Upvotes: 1
Views: 23
Reputation: 163632
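You can optionally capture MEDICINE in the capture group, as long as CARE is not followed by an optional MEDICINE and then AND. In your pattern the lookahead comes after optional parts, so the regex engine can backtrack: it gives up MEDICINE and retries the lookahead at a position where the next characters are no longer a whitespace character followed by AND, so the assertion succeeds.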
Note that \W and \s can also match a newline.
.*\b(CRIT(?:ICAL)?\W*CARE\b(?!(?:\s+MEDICINE\b)?\s+AND\b)(?:\s+MEDICINE\b)?).*
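To compare the two patterns side by side, here is a minimal Python sketch. It assumes Python's re module handles these constructs the same way as the flavor used on regex101 in the question; the helper name capture and the sample strings are only illustrative.

```python
import re

# Pattern from the question: the lookahead sits after two optional parts.
ORIGINAL = re.compile(
    r".*((?:(?:\bCRIT(?:ICAL)?\W*CARE\W*(?:MEDICINE)?))(?!\sAND)).*"
)
# Pattern from this answer: the lookahead runs right after CARE.
FIXED = re.compile(
    r".*\b(CRIT(?:ICAL)?\W*CARE\b(?!(?:\s+MEDICINE\b)?\s+AND\b)(?:\s+MEDICINE\b)?).*"
)


def capture(pattern, text):
    """Return group 1 if the pattern matches, otherwise None (hypothetical helper)."""
    match = pattern.match(text)
    return match.group(1) if match else None


samples = [
    "CRITICAL CARE MEDICINE",
    "CRIT CARE",
    "CRITICAL CARE MEDICINE AND",
    "CRIT CARE AND",
]

for text in samples:
    # repr() makes any trailing whitespace in the capture visible.
    print(f"{text!r}: original={capture(ORIGINAL, text)!r}, fixed={capture(FIXED, text)!r}")
```

For the two inputs ending in AND, the original pattern still matches: after backtracking, the lookahead is tested right before MEDICINE or AND, where the next character is not whitespace, so it passes. The capture then carries the trailing space that \W* consumed. The second pattern returns no match for those two inputs.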
The pattern matches:
.*
Match the whole line
\b
A word boundary to prevent a partial word match
(
Capture group 1
CRIT(?:ICAL)?
Match CRIT or CRITICAL
\W*CARE\b
Match optional non-word chars and then match the word CARE
(?!
Negative lookahead, assert that what is directly to the right is not
(?:\s+MEDICINE\b)?
Optionally match MEDICINE
\s+AND\b
Match AND
)
Close the lookahead
(?:\s+MEDICINE\b)?
Optionally match MEDICINE
)
Close group 1
.*
Match the rest of the line
See a regex demo.
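If you use Python, the same breakdown can be kept next to the pattern itself by compiling it in verbose mode. This is only a sketch: the names ANNOTATED and COMPACT are illustrative, and it assumes the pattern contains no literal spaces or # characters, which re.VERBOSE would otherwise ignore or treat as comments.

```python
import re

# Compact form of the pattern, used here only for comparison.
COMPACT = re.compile(
    r".*\b(CRIT(?:ICAL)?\W*CARE\b(?!(?:\s+MEDICINE\b)?\s+AND\b)(?:\s+MEDICINE\b)?).*"
)

# Same pattern in verbose mode: whitespace is ignored and # starts a comment.
ANNOTATED = re.compile(
    r"""
    .*                          # any text before the phrase
    \b                          # word boundary, prevents a partial word match
    (                           # open capture group 1
        CRIT(?:ICAL)?           #   CRIT or CRITICAL
        \W*CARE\b               #   optional non-word chars, then the word CARE
        (?!                     #   negative lookahead: what follows must not be
            (?:\s+MEDICINE\b)?  #     an optional MEDICINE
            \s+AND\b            #     followed by AND
        )                       #   close the lookahead
        (?:\s+MEDICINE\b)?      #   optionally match MEDICINE
    )                           # close capture group 1
    .*                          # the rest of the line
    """,
    re.VERBOSE,
)


def group_one(pattern, text):
    """Return group 1 if the pattern matches, otherwise None (hypothetical helper)."""
    match = pattern.match(text)
    return match.group(1) if match else None


for text in ("CRIT CARE", "CRITICAL CARE MEDICINE", "CRIT CARE MEDICINE AND"):
    # Both forms should agree on every input.
    print(repr(text), group_one(COMPACT, text), group_one(ANNOTATED, text))
```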
Upvotes: 1