ormxmi

Reputation: 43

regex - multiple matches after a specific word

Simplified example: consider the string aabaabaabaabaacbaabaabaabaa

I want to match every occurrence of aa, but only those after the c in the middle, using a single regex.

The closest I've come is c.*\Kaa, but it matches only the last aa, or only the first aa when the ungreedy flag is set.

I'm using the regex101 website for testing.

Upvotes: 4

Views: 1056

Answers (2)

Cary Swoveland

Reputation: 110665

If it is known that the string contains exactly one 'c', just match

aa(?!.*c)

Demo

(?!.*c) is a negative lookahead that asserts that 'c' does not appear later in the string.
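As a rough illustration (not part of the original answer), the lookahead pattern can be tried with Python's built-in re module, which supports negative lookahead. The helper name find_after_c is made up for this sketch; the sample string is the one from the question.

```python
import re

# Pattern from the answer: 'aa' with no 'c' later on the same line.
AA_WITH_NO_LATER_C = re.compile(r"aa(?!.*c)")


def find_after_c(text):
    """Return the start offset of every 'aa' that has no 'c' after it.

    Hypothetical helper for illustration only.
    """
    return [m.start() for m in AA_WITH_NO_LATER_C.finditer(text)]


sample = "aabaabaabaabaacbaabaabaabaa"
print(find_after_c(sample))  # [16, 19, 22, 25]
```

If the text contains no 'c' at all, the lookahead always succeeds and every 'aa' is reported, which is why this pattern is only suitable when a 'c' is known to be present.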


If it is not known how many 'c' characters the string contains (zero, one or more), and 'aa' is to be matched if and only if the string contains at least one 'c' and no 'c' follows that 'aa' later in the string, one can use the regular expression

^.*c\K|(?!^)aa

Demo

The regular expression can be broken down as follows.

^      # match the beginning of the string
.*     # match zero or more chars, as many as possible
c      # match 'c'
\K     # reset match pointer in string and discard all previously
       # matched characters
|      # or
(?!^)  # negative lookahead asserts current string position is not
       # at the beginning of the string
aa     # match 'aa'

Note that if the string contains no 'c' there will be no match.
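Python's built-in re module does not support \K, so the expression above cannot be used there unchanged. The following is a minimal sketch of the stated goal (report 'aa' only after the last 'c', and nothing when there is no 'c') done in two steps instead. The helper name aa_after_last_c is hypothetical.

```python
import re

AA = re.compile(r"aa")


def aa_after_last_c(text):
    """Return the start offset of each 'aa' located after the last 'c'.

    Hypothetical helper: a two-step stand-in for the single-regex approach,
    for engines that lack the K escape.
    """
    last_c = text.rfind("c")
    if last_c == -1:
        # No 'c' anywhere, so nothing qualifies.
        return []
    # Pattern.finditer accepts a start position; scanning begins after the 'c'.
    return [m.start() for m in AA.finditer(text, last_c + 1)]


print(aa_after_last_c("aabaabaabaabaacbaabaabaabaa"))  # [16, 19, 22, 25]
print(aa_after_last_c("aabaa"))                        # []
```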

Upvotes: 1

Wiktor Stribiżew

Reputation: 626699

You can use

(?:\G(?!^)|c).*?\Kaa

See the regex demo. Details:

  • (?:\G(?!^)|c) - either the end of the previous successful match (\G(?!^)) or (|) a c char
  • .*? - any zero or more chars other than line break chars, as few as possible
  • \K - forget the text matched so far
  • aa - an aa string.
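For engines without \G or \K (Python's built-in re module is one), the same result on a single-line string can be approximated in two steps: find the first c, then collect every non-overlapping aa after it. This is only a sketch under that single-line assumption, and the helper name aa_after_first_c is invented here.

```python
import re

AA = re.compile(r"aa")


def aa_after_first_c(text):
    """Return the start offset of each 'aa' found after the first 'c'.

    Hypothetical helper: approximates the chained-match pattern by
    scanning from the first 'c' onward. Assumes text has no line breaks.
    """
    first_c = text.find("c")
    if first_c == -1:
        # Without a 'c' the chain never starts, so there are no matches.
        return []
    return [m.start() for m in AA.finditer(text, first_c + 1)]


print(aa_after_first_c("aabaabaabaabaacbaabaabaabaa"))  # [16, 19, 22, 25]
print(aa_after_first_c("aacaacaa"))                     # [3, 6]
```

Because scanning starts at the first c, a string with several c characters yields every aa after the first one, not only those after the last.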

Upvotes: 3
