How to match everything except some character? regex

Question

I have these two strings:

s2 = start bla bla bla word1 bla bla bla word1 value1 word1 bla bla bla

s1= start bla bla bla word1 bla bla bla word1 bla bla bla word1 value1

I want to check whether a string s contains value1, but only when it comes right after the second word1. So s2 should return value1, but s1 should return None, because there value1 comes after the third word1.

I tried this:

re.search(r'start(.*?word1){2}\s+(value1)', s)

The problem is that my search returns value1 for both s1 and s2, because . matches everything, including word1 itself!
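To make the problem concrete, here is a small sketch that runs the attempted pattern against both strings. The variable names attempt and results are only illustrative. It shows that backtracking lets the lazy .*? stretch across an extra word1, so both strings match.

```python
import re

s2 = "start bla bla bla word1 bla bla bla word1 value1 word1 bla bla bla"
s1 = "start bla bla bla word1 bla bla bla word1 bla bla bla word1 value1"

# The pattern from the question: nothing stops .*? from consuming a word1.
attempt = r'start(.*?word1){2}\s+(value1)'

results = {}
for name, s in (("s1", s1), ("s2", s2)):
    m = re.search(attempt, s)
    results[name] = m.group(2) if m else None
    print(name, results[name])

# In s1 the second repetition of the group is stretched over two word1 strings.
stretched = re.search(attempt, s1).group(1)
print(repr(stretched))
```

Both lines print value1, and the last line shows the second repetition swallowing "word1 bla bla bla word1" in s1.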

Avinash Raj · Accepted Answer

Use a negative lookahead assertion, as below. This regex captures the string value1 only if it is preceded by exactly two word1 strings, counted from start.

r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)'

(?:(?!word1).)* matches any character, zero or more times, as long as that character does not begin the string word1. That is, before matching each single character, the regex engine checks that the text at the current position is not w followed by ord1. Only if that check passes does the engine match the character. The check is repeated before every character, so matching stops as soon as the engine reaches a word1 string.
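The difference between a plain .* and the lookahead-guarded version can be seen in isolation. This is a minimal sketch; the sample text and the names greedy_prefix and tempered_prefix are made up for illustration.

```python
import re

text = "bla bla word1 value1"

# Plain greedy dot-star: runs straight past word1 to the end of the line.
greedy_prefix = re.match(r'.*', text).group(0)

# Guarded version: before each character, (?!word1) checks that word1
# does not start at the current position, so matching stops just before it.
tempered_prefix = re.match(r'(?:(?!word1).)*', text).group(0)

print(repr(greedy_prefix))    # 'bla bla word1 value1'
print(repr(tempered_prefix))  # 'bla bla '
```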

DEMO

>>> import re
>>> s2 = "start bla bla bla word1 bla bla bla word1 value1 word1 bla bla bla"
>>> s1= "start bla bla bla word1 bla bla bla word1 bla bla bla word1 value1"
>>> re.search(r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)', s2)
<_sre.SRE_Match object at 0x7f0bb60e9558>
>>> re.search(r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)', s2).group(1)
'value1'
>>> re.search(r'start(?:(?:(?!word1).)*word1){2}(?:(?!word1).)*?(value1)', s1)
>>>
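If the marker word, the target, or the count varies, the same pattern can be assembled from parameters. The following is only a sketch under that assumption: the function find_after_nth and its parameter names are hypothetical, not part of the answer above or of the re module.

```python
import re

def find_after_nth(text, marker="word1", target="value1", n=2, start="start"):
    """Return target if it appears after exactly n markers following start,
    otherwise None. Hypothetical helper built around the answer's pattern."""
    mark = re.escape(marker)
    # One character that does not begin the marker string.
    skip = r'(?:(?!' + mark + r').)'
    pattern = (
        re.escape(start)
        + r'(?:' + skip + r'*' + mark + r'){' + str(n) + r'}'  # exactly n markers
        + skip + r'*?'                                          # no further marker
        + r'(' + re.escape(target) + r')'
    )
    match = re.search(pattern, text)
    return match.group(1) if match else None

s2 = "start bla bla bla word1 bla bla bla word1 value1 word1 bla bla bla"
s1 = "start bla bla bla word1 bla bla bla word1 bla bla bla word1 value1"

print(find_after_nth(s2))       # value1
print(find_after_nth(s1))       # None
print(find_after_nth(s1, n=3))  # value1
```

Note that . does not match a newline by default, so for multi-line input the pattern would need the re.DOTALL flag.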
