Regular Expression for consecutive patterns

Question

I have this text:

a aa aaa aaa aaaa aa aaa

I need to catch all the aaa sequences in the text, but ignore them when there are four in a row, like aaaa. Ideally, I would be able to detect this:

a aa **aaa** **aaa** aaaa aa **aaa**

Currently I have this regular expression:

[^a]aaa[^a]

This works for the first and the last 'aaa', but it can't catch the second one, because the space between aaa aaa is already consumed by the first match.

a aa **aaa** aaa aaaa aa **aaa**
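
The behaviour described above can be reproduced with a short Python sketch. The variable names are illustrative, and the trailing newline is an assumption: at the very end of a string the closing [^a] would have no character left to consume, so the last aaa only matches when something follows it.

```python
import re

# Assumption: the text is followed by a newline, as it would be in a file.
text = "a aa aaa aaa aaaa aa aaa\n"

# Each match consumes the non-a character on both sides of the three a's.
matches = re.findall(r"[^a]aaa[^a]", text)
print(matches)  # [' aaa ', ' aaa\n']
```

The first match swallows the space at index 8, so the search resumes at the a of the second aaa, where no leading non-a character is left for [^a] to match.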

Any ideas on how to make this regex?

Pi Marillion · Accepted Answer

I'll assume that you also want to catch the aaa if it's part of a sequence outside of spaces, e.g.

aaabbccaabccaccbbbaaaccbbaaaaccbbaacccaaab
^^^               ^^^                 ^^^

In this case, a negative lookaround would be your best bet:

re.findall('(?<!a)aaa(?!a)', text)

(?<!a) means "not preceded by an a".

aaa matches your three as.

(?!a) means "not followed by an a".

Thus, the above only matches aaa without any additional as directly before or after the matching three.
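
As a quick check, here is a minimal sketch that applies a lookbehind/lookahead pattern to both example strings from this page. It assumes Python's re module; the variable names are only illustrative.

```python
import re

# Lookarounds assert what surrounds the match without consuming it.
pattern = re.compile(r"(?<!a)aaa(?!a)")

question_text = "a aa aaa aaa aaaa aa aaa"
answer_text = "aaabbccaabccaccbbbaaaccbbaaaaccbbaacccaaab"

# Start offsets of every run of exactly three a's.
starts_q = [m.start() for m in pattern.finditer(question_text)]
starts_a = [m.start() for m in pattern.finditer(answer_text)]

print(starts_q)  # [5, 9, 21]
print(starts_a)  # [0, 18, 38]
```

Because the lookarounds are zero-width, the space between the two adjacent aaa groups is not used up by the first match, and a match at the very start or end of the string works as well.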
